I'm using Camel (2.11.0) to try and achieve the following functionality:
If a file exists at a certain location, copy it to another location and then begin processing it
If no such file exists, then I don't want the file consumer/poller to block; I just want processing to continue to a direct:cleanup route
I only want the file to be polled once!
Here's what I have so far (using Spring XML):
<camelContext id="my-camel-context" xmlns="http://camel.apache.org/schema/spring">
<route id="my-route">
<from uri="file:///home/myUser/myApp/fizz?include=buzz_.*txt"/>
<choice>
<when>
<!-- If the body is empty/NULL, then there was no file. Send to cleanup route. -->
<simple>${body} == null</simple>
<to uri="direct:cleanup" />
</when>
<otherwise>
<!-- Otherwise we have a file. Copy it to the parent directory, and then continue processing. -->
<to uri="file:///home/myUser/myApp" />
</otherwise>
</choice>
<!-- We should only get here if a file existed and we've already copied it to the parent directory. -->
<to uri="bean:shouldOnlyGetHereIfFileExists?method=doSomething" />
</route>
<!--
Other routes defined down here, including one with a "direct:cleanup" endpoint.
-->
</camelContext>
With the above configuration, if there is no file at /home/myUser/myApp/fizz, then Camel just waits/blocks until there is one. Instead, I want it to just give up and move on to direct:cleanup.
And if there is a file, I see it getting processed inside the shouldOnlyGetHereIfFileExists bean, but I do not see it getting copied to /home/myUser/myApp; so it's almost as if the <otherwise> element is being skipped/ignored altogether!
Any ideas? Thanks in advance!
Try this setting, and tune your polling interval to suit:
From Camel File Component docs:
sendEmptyMessageWhenIdle
Default: false
Camel 2.9: If the polling consumer did not poll any files, you can enable this option to send an empty message (no body) instead.
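Applied to the consumer URI from the question, that might look like this (a sketch; note that `&` must be escaped as `&amp;` inside Spring XML attributes):

```xml
<from uri="file:///home/myUser/myApp/fizz?include=buzz_.*txt&amp;sendEmptyMessageWhenIdle=true"/>
```

With this in place, an idle poll produces an exchange with a null body, so the `${body} == null` branch of the choice can route to direct:cleanup.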
Regarding writing the file, add a log statement inside the <otherwise> to ensure it's being executed. If so, check file / folder permissions, etc.
Good luck.
One error I faced while using the condition:
<simple>${body} != null</simple>
was that it always returned true.
Please go through the below link:
http://camel.465427.n5.nabble.com/choice-when-check-BodyType-null-Body-null-td4259599.html
It may help you.
This is very old, but if anyone finds this, you can poll only once with
"?repeatCount=1"
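Combined with the consumer URI from the question, that would look something like this (a sketch; `&` escaped as `&amp;` for Spring XML):

```xml
<from uri="file:///home/myUser/myApp/fizz?include=buzz_.*txt&amp;repeatCount=1"/>
```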
I know the question was asked almost 4 years ago, but I had exactly the same problem yesterday.
So I will let my answer here, maybe it will help another person.
I am using Camel, version 3.10.0
To make it work exactly as described in the question:
If a file exists at a certain location, copy it to another location and then begin processing it
If no such file exists, then I don't want the file consumer/poller to block; I just want processing to continue to a direct:cleanup route
I ONLY want the file to be polled once!
Using ${body} == null
The configuration options that we need are:
sendEmptyMessageWhenIdle=true // Sends an empty body when idle
maxMessagesPerPoll=1 // Max files it will take at once
repeatCount=1 // How many times it will execute the poll (above)
greedy=true // If the last poll executed with files, it will execute one more time
XML:
<camel:endpoint id="" uri="file:DIRECTORY?sendEmptyMessageWhenIdle=true&amp;initialDelay=100&amp;maxMessagesPerPoll=1&amp;repeatCount=1&amp;greedy=false" />
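For reference, a registered endpoint like that can be consumed from a route via Camel's ref: component. A sketch (the id pollOnce, DIRECTORY placeholders, and the route body are illustrative, combining the options above with the choice from the original question):

```xml
<camel:endpoint id="pollOnce"
    uri="file:DIRECTORY?sendEmptyMessageWhenIdle=true&amp;initialDelay=100&amp;maxMessagesPerPoll=1&amp;repeatCount=1&amp;greedy=false" />

<camel:route>
    <camel:from uri="ref:pollOnce"/>
    <camel:choice>
        <camel:when>
            <!-- Empty body: the poll found no file -->
            <camel:simple>${body} == null</camel:simple>
            <camel:to uri="direct:cleanup"/>
        </camel:when>
        <camel:otherwise>
            <!-- A file was found: copy it and continue processing -->
            <camel:to uri="file:OTHER_DIRECTORY"/>
        </camel:otherwise>
    </camel:choice>
</camel:route>
```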
Related
Here is a simplified ftp polling mechanism.
<camelContext id="Fetcher" xmlns="http://camel.apache.org/schema/blueprint">
<redeliveryPolicyProfile id="msRedeliveryPolicy"
                         redeliveryDelay="10000"
                         maximumRedeliveries="-1" />
<camel:route id="fetchFiles">
<camel:from uri="ftp://10.20.30.40/From?username=user&amp;password=RAW({{password}})&amp;delay=3000" />
<camel:to uri="log:input?showAll=true&amp;level=INFO"/>
<camel:to uri="file://incomingDirectory" />
<onException redeliveryPolicyRef="msRedeliveryPolicy">
<exception>java.lang.Exception</exception>
<redeliveryPolicy logRetryAttempted="true" retryAttemptedLogLevel="WARN"/>
</onException>
</camel:route>
</camelContext>
What do you think happens on failure? (Delay is 3 seconds, and
redeliveryDelay is 10 seconds.)
Answer: It polls every 3 seconds, forever.
So let's look at the docs. Maybe I need this
"repeatCount (scheduler)"
Specifies a maximum limit of number of fires. So if you set it to 1, the scheduler will only fire once. If you set it to 5, it will only fire five times. A value of zero or negative means fire forever.
Default: 0
Nope, it's not even a valid parameter. So why's it in the docs?
Unknown parameters=[{repeatCount=5}]
Ok, so I suppose it polls every 3 seconds. So how do I tell Camel to stop that? Let's try setting 'handled' to true:
<onException redeliveryPolicyRef="msRedeliveryPolicy">
<exception>java.lang.Exception</exception>
<redeliveryPolicy logRetryAttempted="true" retryAttemptedLogLevel="WARN"/>
<handled><constant>true</constant></handled>
</onException>
No luck. Still 3 seconds. It's clearly not even getting to the redelivery part.
What's the secret?
The fact is that errors which happen in the from endpoint are not handled by the user-defined route (i.e. fetchFiles in the setup above). So onException and redeliveryPolicy are not involved, as they only affect what belongs to the user-defined route.
To control the behavior of the consumer defined in the from endpoint, the obvious way is to use the options that exist on that component. As suggested by #Screwtape, use backoffErrorThreshold and backoffMultiplier for your case.
Why does the parameter repeatCount exist in the docs but is invalid to use? It probably does not exist in your Camel version, and the Camel documentation writer forgot to mark the version in which it first appeared.
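A sketch of the from endpoint with those backoff options applied (the values are illustrative; per the Camel scheduled-poll-consumer docs, backoffErrorThreshold is the number of consecutive error polls before backing off, and backoffMultiplier is the number of polls then skipped):

```xml
<camel:from uri="ftp://10.20.30.40/From?username=user&amp;password=RAW({{password}})&amp;delay=3000&amp;backoffErrorThreshold=3&amp;backoffMultiplier=4" />
```

With these values, after 3 failed polls in a row the consumer skips 4 poll cycles (so roughly 12 seconds at delay=3000) before trying again.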
I have the following route...
<route id="VM01_spit_products">
<from uri="direct:processXML" />
<split>
<method ref="CamelSplitOnKey" method="splitIntoBatches" />
<to uri="vm:processXMLSplit" />
</split>
</route>
<route id="VM01_processXML">
<from uri="vm:processXMLSplit?concurrentConsumers=15" />
<bean ref="Builder" method="createXMLFile" />
<to uri="{{ChangeReceiver}}" />
</route>
I was expecting that using VM or SEDA would mean that if the splitter produces 5 messages, one of the 15 threads I have defined would pick up each of these messages. When I debug into the Builder class, I can see that the messages are being picked up sequentially.
I see the same whether I use VM or SEDA.
Can someone suggest where I'm going wrong?
Notes:
Camel 2.6 due to JDK 1.5
New info.
I've added this code into my Builder.java
SedaEndpoint seda = (SedaEndpoint) camelContext.getEndpoint("seda:processXMLSplit");
int size = seda.getExchanges().size();
System.out.println("size ["+size+"]");
This prints a size of 0 each time.
This makes me think that the split isn't queuing up the messages as I expect.
Even if you have defined your vm consumer with 15 threads, that doesn't affect how your Split is working. By default Split works sequentially, so you must configure your Split to use parallelProcessing in order to get the result you want. See further in Splitter ParallelProcessing. Note also #Itsallas's comment that you might need your vm endpoint configured with the same parameters.
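In the XML DSL that amounts to one attribute on the split element. A sketch based on the route from the question:

```xml
<route id="VM01_spit_products">
    <from uri="direct:processXML" />
    <!-- parallelProcessing lets the splitter hand the parts to a thread pool
         instead of sending them one after another -->
    <split parallelProcessing="true">
        <method ref="CamelSplitOnKey" method="splitIntoBatches" />
        <to uri="vm:processXMLSplit" />
    </split>
</route>
```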
I have an orchestration flow which calls a subflow to write the file and then next flow (via flow ref) to sftp it.
WriteSubflow
<file:outbound-endpoint path="${outputDir}" outputPattern="${fileName}" responseTimeout="10000" doc:name="Write file"/>
<logger message="File ${fileName} written to ${outputDir}" level="INFO" doc:name="End"/>
Then I call a flow (via ref) which kicks off sftp process.
<flow name="sftp">
<file:inbound-endpoint path="${outputDir}" connector-ref="File" responseTimeout="10000" doc:name="File">
<file:filename-regex-filter pattern="${fileName}" caseSensitive="true"/>
</file:inbound-endpoint>
<sftp:outbound-endpoint exchange-pattern="one-way" connector-ref="SFTP" responseTimeout="10000" ref="endpoint" doc:name="SFTP" />
</flow>
The problems are:
1) While the file is being written, the flow executes the logger after the file outbound endpoint and says the file is already written, and only after a while does the file connector spit out "Write file to ...". How do I make the logger wait for the file to be done writing?
2) The file inbound endpoint in flow sftp above is executed immediately, and the file isn't ready yet, so it first throws an exception saying it expects an InputStream, byte[] or String but got an ArrayList (which is the original payload from the orchestration flow). After this exception is printed, the file is finally ready, and the inbound file connector kicks off, reads it, and sftps it fine. This seems related to the question above, in that I need to somehow make the rest of the flow wait for the file writing to finish.
Note: I have to create the sftp flow as a flow instead of a subflow, because it needs a source. I think if I don't make it a flow and the file connector is not a source, it will become an outbound connector.
Any help appreciated.
So I finally figured it out; somehow both these questions are answered in one blog post here:
http://www.sixtree.com.au/articles/2015/advanced-file-handling-in-mule/
The key for #1 is
<file:connector name="synchronous-file-connector" doc:name="File">
<dispatcher-threading-profile doThreading="false"/>
</file:connector>
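The connector then needs to be referenced from the outbound endpoint. A sketch reusing the endpoint from the question (connector-ref is the standard Mule 3 attribute for picking a named connector):

```xml
<file:outbound-endpoint path="${outputDir}" outputPattern="${fileName}"
    connector-ref="synchronous-file-connector"
    responseTimeout="10000" doc:name="Write file"/>
```

With doThreading="false" on the dispatcher, the write happens on the flow's own thread, so the logger after the endpoint only runs once the file is fully written.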
For #2, as Ryan mentions above, use the mule requester module.
1) Set the Flow's processingStrategy to synchronous:
<flow name="testFlow" processingStrategy="synchronous">
<poll frequency="10000">
<set-payload value="some test text" />
</poll>
<file:outbound-endpoint path="/Users/ryancarter/Downloads"
outputPattern="test.txt" />
<logger level="ERROR" message="After file..." />
</flow>
2) Not sure I quite understand, but you can't invoke inbound-endpoints via flow-ref, so the inbound endpoint will be ignored and will run on its own regardless of the calling flow. If you want to read in the file mid-flow, then use the mule-requester module: http://blogs.mulesoft.com/introducing-the-mule-requester-module/
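A minimal sketch of a mid-flow read with the requester module (hedged: the element and attribute names are as described in the module's documentation, and the resource URL here is an assumption built from the properties used in the question):

```xml
<!-- Requests the file on demand, at this exact point in the flow,
     instead of relying on a polling inbound endpoint -->
<mulerequester:request resource="file://${outputDir}" doc:name="Request file"/>
```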
I am looking at an issue that we have in our application. Spring Integration is being used to poll a particular directory and then process the files in it. It can process 5k 1kb files, but sometimes there is a huge pause where the application is doing nothing, just sitting idle, and then it completes the process in 4 minutes. The next run then takes a bit longer, and the one after that slightly longer still, and so on until I restart the application, at which point it goes back to the 4-minute mark. Has anyone experienced this issue before?
I wrote a standalone version without Spring Integration and don't get the same issue.
I have also pasted the XML config below, just in case I have done something wrong that I can't spot.
Thanks in advance.
<!-- Poll the input file directory for new files. If found, send a Java File object on inputFileChannel -->
<file:inbound-channel-adapter directory="file:${filepath}"
channel="inputFileChannel" filename-regex=".+-OK.xml">
<si:poller fixed-rate="5000" max-messages-per-poll="1" />
</file:inbound-channel-adapter>
<si:channel id="inputFileChannel" />
<!-- Call splitFile() and start parsing the XML inside the File -->
<si:service-activator input-channel="inputFileChannel"
method="splitFile" ref="splitFileService">
</si:service-activator>
<!-- Poll the input file directory for new files. If found, send a Java File object on inputFileChannel -->
<file:inbound-channel-adapter directory="file:${directorypath}" channel="inputFileRecordChannel" filename-regex=".+-OK.xml">
<si:poller fixed-rate="5000" max-messages-per-poll="250" task-executor="executor" />
</file:inbound-channel-adapter>
<task:executor id="executor" pool-size="8"
queue-capacity="0"
rejection-policy="DISCARD"/>
<si:channel id="inputFileRecordChannel" />
<!-- Call processFile() and start parsing the XML inside the File -->
<si:service-activator input-channel="inputFileRecordChannel"
method="processFile" ref="processedFileService">
</si:service-activator>
<si:channel id="wsRequestsChannel"/>
<!-- Sends messages from wsRequestsChannel to the httpSender, and returns the responses on
wsResponsesChannel. This is used once for each record found in the input file. -->
<int-ws:outbound-gateway uri="#{'http://localhost:'+interfaceService.getWebServiceInternalInterface().getIpPort()+'/ws'}"
message-sender="httpSender"
request-channel="wsRequestsChannel" reply-channel="wsResponsesChannel" mapped-request-headers="soap-header"/>
<!-- Handles the responses from the web service (wsResponsesChannel). Again
this is used once for each response from the web service -->
<si:service-activator input-channel="wsResponsesChannel"
method="handleResponse" ref="responseProcessedFileService">
</si:service-activator>
As I surmised in the comment to your question, the (default) AcceptOnceFileListFilter does not scale well for a large number of files because it performs a linear search over the previously processed files.
We can make some improvements there; I opened a JIRA Issue for that.
However, if you don't need the semantics of that filter (i.e. your flow removes the input file on completion), you can replace it with another filter, such as an AcceptAllFileListFilter.
If you need accept once semantics you will need a more efficient implementation for such a large number of files. But I would warn that when using such a large number of files, if you don't remove them after processing, things are going to slow down anyway, regardless of the filter.
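If accept-once tracking is indeed unnecessary (because the flow deletes or moves each file after processing), swapping in a stateless filter might look like this in the question's own Spring XML (a sketch; the bean id is illustrative, and note that the filter attribute replaces filename-regex on the adapter, so the regex moves into a RegexPatternFileListFilter):

```xml
<!-- Stateless filter: matches on name only, keeps no history of seen files -->
<bean id="regexOnlyFilter"
      class="org.springframework.integration.file.filters.RegexPatternFileListFilter">
    <constructor-arg value=".+-OK.xml"/>
</bean>

<file:inbound-channel-adapter directory="file:${filepath}"
        channel="inputFileChannel" filter="regexOnlyFilter">
    <si:poller fixed-rate="5000" max-messages-per-poll="1" />
</file:inbound-channel-adapter>
```

Because nothing accumulates between polls, the per-poll cost stays constant no matter how many files have already been processed.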
We are in the process of building a Java application with Camel to transfer files between two FTP locations. Is there a way to get a notification on the successful transfer of a file? We are not allowed to use a JMS solution for building the application.
You could create another route with seda/vm as the endpoint. This endpoint needs to be called after the ftp endpoint.
<route id="MainRoute">
<from uri="ftp:RemoteLocation"/>
<from uri="seda:Retry"/>
<to uri="seda:MyLog"/>
<!--Your Main Processing logic -->
</route>
<route id="Notification-processor">
<from uri="seda:MyLog"/>
<!--Your Logging/Notification Processing logic -->
</route>
In the above scenario you can put your custom notification/log activity in the Notification-processor route. If you need to notify about anomalies, you can add a to endpoint in the Notification-processor for sending the notification.
You need to write logic to check whether the message is complete; if it is not, you can have a bean called in the Notification-processor with a dynamic route to extract the specific file from the ftp location and reprocess it, like below:
<route id="Notification-processor">
<from uri="seda:MyLog"/>
<!--Anomaly checker -->
<to uri="seda:Retry"/>
<!--Your Logging/Notification Processing logic -->
</route>