Remove a document from Mongo DB with Apache Camel - java

My main issue might be not understanding some conventions in the Camel documents.
https://camel.apache.org/components/latest/mongodb-component.html#_delete_operations
They have a Camel route commented out, and two Java objects being defined, which are not commented out. What are they trying to indicate? Where do these objects live in a project?
Anyway, I'm subscribed to a JMS queue that I have another Camel route publishing to. The message is a JSON string, which I save to a MongoDB. But what I'd like to do is remove any current documents (based on criteria) and replace them with the new message.
from("jms:topic:orderbook.raw.feed")
.log("JMS Message: ${body}")
.choice()
.when().jsonpath("$.[?(#.type=='partial')]")
// Figure out how to delete the old orderbook from Mongo with a type=T1
.to("mongodb:mongo?database=k2_dev&collection=orderbooks&operation=save");

Does your orderbook have an ID? If so, you can enrich the JSON with an _id field (MongoDB's default representation for identifiers) whose value would be that ID. That way you'll be "upserting" that orderbook.
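A minimal sketch of that idea (assumptions: Jackson is on the classpath, and the incoming JSON carries an "id" field you want to reuse as the Mongo _id; adjust to your payload):
from("jms:topic:orderbook.raw.feed")
    .log("JMS Message: ${body}")
    .process(exchange -> {
        // requires com.fasterxml.jackson.databind.ObjectMapper / node.ObjectNode
        // Copy the orderbook's own ID into _id so the save behaves as an upsert.
        ObjectNode json = (ObjectNode) new ObjectMapper()
                .readTree(exchange.getIn().getBody(String.class));
        json.set("_id", json.get("id")); // "id" is a hypothetical field name
        exchange.getIn().setBody(json.toString());
    })
    .to("mongodb:mongo?database=k2_dev&collection=orderbooks&operation=save");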
Note: granted, the Camel docs could be better.
But if you really feel you'd have to perform a remove operation before saving an orderbook, another option would be to extract its type from the current JSON string and use it as a filter when removing. Something like:
from("jms:topic:orderbook.raw.feed")
.log("JMS Message: ${body}")
.filter("$.[?(#.type=='partial')]")
.multicast().stopOnException()
.to("direct://orderbook-removal")
.to("direct://orderbook-save")
.end()
;
from("direct://orderbook-removal")
// extract type and set it as the body message. e.g. {"type":"T1"}
.to("mongodb:mongo?database=k2_dev&collection=orderbooks&operation=remove")
;
from("direct://orderbook-save")
.to("mongodb:mongo?database=k2_dev&collection=orderbooks&operation=save")
;
The multicast sends a copy of the message to each destination. So the content won't be affected.
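As a rough sketch of the removal route (assumptions: the orderbook type sits in a top-level "type" field, and the MongoDB Java driver's org.bson.Document is available for building the filter):
from("direct://orderbook-removal")
    .process(exchange -> {
        // Build a {"type": <value>} filter document from the incoming JSON
        // so the remove operation only deletes orderbooks of that type.
        String json = exchange.getIn().getBody(String.class);
        String type = org.bson.Document.parse(json).getString("type");
        exchange.getIn().setBody(new org.bson.Document("type", type));
    })
    .to("mongodb:mongo?database=k2_dev&collection=orderbooks&operation=remove");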

Related

how to accept fixed length message by spring rest api

I need to expose a REST API with Spring Boot that has to accept message content in fixed-length format and, as is (i.e. without disturbing the fixed-length message content), put it onto IBM MQ so the received fixed-length message is passed to the back-end system (an IBM mainframe) via IBM MQ.
I would appreciate some sample code for this requirement.
Fixed-length format messages, for example:
20011228YF2001122814313425 Forest St Marlborough MA017525083828200600
Fixed-length format messages use ordinal positions, which are offsets to identify where fields are within the record. There are no field delimiters. An end-of-record delimiter is required, even for the last record.
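For illustration, fields are recovered purely by substring offsets; the offsets below are hypothetical and would come from the actual record layout (copybook):
String record = "20011228YF2001122814313425 Forest St Marlborough MA017525083828200600";
String date  = record.substring(0, 8);   // hypothetical: positions 0-7
String flags = record.substring(8, 10);  // hypothetical: positions 8-9
// ...remaining fields sliced by their documented start/end offsets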
Welcome to Stack Overflow. As I see it, this is more a case for the javax.validation.constraints package. In your entity, you do something like:
public class Market {

    @NotNull
    @Size(max = 4)
    private String marketCode;

    // getters/setters
}
So when you try to bind values from @RequestBody Market market, it will throw an error regarding the length.
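A minimal controller sketch of how that binding might look (the class name, endpoint path, and method name are assumptions; spring-boot-starter-validation must be on the classpath, and @Valid is what actually triggers the constraint checks):
import javax.validation.Valid;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class MarketController {

    @PostMapping("/market")
    public ResponseEntity<String> accept(@Valid @RequestBody Market market) {
        // If marketCode violates @Size(max = 4), Spring rejects the request
        // with a 400 before this method body runs.
        return ResponseEntity.ok("accepted");
    }
}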

Apache camel losing trace id and span id after camel split eip

I need to have trace id and span id available in all my logs. However I am observing that after the first splitter in my camel route, I can no longer see the trace id and span id in my logs.
[traceId: spanId:] INFO ---
Is there any way to enable back the tracing information?
From the Camel Documentation I have tried to start the tracing after the split by using
context.setTracing(true)
But it looks like this is not working.
Am I missing anything? Please help.
You probably have the traceId and spanId stored in the exchange message headers which are lost after the split.
A solution is to store them in the exchange properties (before the split), which are stored for the entire processing of the exchange (see Passing values between processors in Apache Camel).
If you are using the Java DSL you can use:
.setProperty("traceId ", constant("traceIdValue"))
.setProperty("spanId", constant("spanIdValue"))
You can use the Simple Expression Language (https://camel.apache.org/manual/latest/simple-language.html) to access the properties after the split using exchangeProperty.property_name.
Example:
.log(LoggingLevel.INFO, "[traceId:${exchangeProperty.traceId} spanId:${exchangeProperty.spanId}]")
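For instance, a rough route sketch (the endpoint name and the assumption that the IDs arrive as headers are mine), setting the properties before the split so they remain visible inside and after it:
from("direct:input")
    .setProperty("traceId", header("traceId"))   // assumed header names
    .setProperty("spanId", header("spanId"))
    .split(body().tokenize(System.lineSeparator()))
        .log(LoggingLevel.INFO, "[traceId:${exchangeProperty.traceId} spanId:${exchangeProperty.spanId}] part: ${body}")
    .end()
    .log(LoggingLevel.INFO, "[traceId:${exchangeProperty.traceId} spanId:${exchangeProperty.spanId}] after split");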
When you use split, new exchanges are created from the original one, and to pass exchange properties downstream you would need to use an aggregation strategy.
Example:
.split().tokenize(System.lineSeparator()).aggregationStrategy(new YourAggregationStrategyClass())
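A rough sketch of what such a strategy could look like (the class name follows the example above; the interface is org.apache.camel.AggregationStrategy in Camel 3, org.apache.camel.processor.aggregate.AggregationStrategy in Camel 2):
public class YourAggregationStrategyClass implements AggregationStrategy {

    @Override
    public Exchange aggregate(Exchange oldExchange, Exchange newExchange) {
        if (oldExchange == null) {
            return newExchange;
        }
        // Copy the tracing properties from the split part back onto the
        // exchange that continues after the split.
        oldExchange.setProperty("traceId", newExchange.getProperty("traceId"));
        oldExchange.setProperty("spanId", newExchange.getProperty("spanId"));
        return oldExchange;
    }
}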

Apache Beam in Dataflow Large Side Input

This is most similar to this question.
I am creating a pipeline in Dataflow 2.x that takes streaming input from a Pubsub queue. Every single message that comes in needs to be streamed through a very large dataset that comes from Google BigQuery and have all the relevant values attached to it (based on a key) before being written to a database.
The trouble is that the mapping dataset from BigQuery is very large - any attempt to use it as a side input fails with the Dataflow runners throwing the error "java.lang.IllegalArgumentException: ByteString would be too long". I have attempted the following strategies:
1) Side input
As stated, the mapping data is (apparently) too large to do this. If I'm wrong here or there is a work-around for this, please let me know because this would be the simplest solution.
2) Key-Value pair mapping
In this strategy, I read the BigQuery data and Pubsub message data in the first part of the pipeline, then run each through ParDo transformations that change every value in the PCollections to KeyValue pairs. Then, I run a Merge.Flatten transform and a GroupByKey transform to attach the relevant mapping data to each message.
The trouble here is that streaming data requires windowing to be merged with other data, so I have to apply windowing to the large, bounded BigQuery data as well. It also requires that the windowing strategies are the same on both datasets. But no windowing strategy for the bounded data makes sense, and the few windowing attempts I've made simply send all the BQ data in a single window and then never send it again. It needs to be joined with every incoming pubsub message.
3) Calling BQ directly in a ParDo (DoFn)
This seemed like a good idea - have each worker declare a static instance of the map data. If it's not there, then call BigQuery directly to get it. Unfortunately this throws internal errors from BigQuery every time (as in the entire message just says "Internal error"). Filing a support ticket with Google resulted in them telling me that, essentially, "you can't do that".
It seems this task doesn't really fit the "embarrassingly parallelizable" model, so am I barking up the wrong tree here?
EDIT :
Even when using a high memory machine in dataflow and attempting to make the side input into a map view, I get the error java.lang.IllegalArgumentException: ByteString would be too long
Here is an example (pseudo) of the code I'm using:
Pipeline pipeline = Pipeline.create(options);

PCollectionView<Map<String, TableRow>> mapData = pipeline
    .apply("ReadMapData", BigQueryIO.read().fromQuery("SELECT whatever FROM ...").usingStandardSql())
    .apply("BQToKeyValPairs", ParDo.of(new BQToKeyValueDoFn()))
    .apply(View.asMap());

PCollection<PubsubMessage> messages = pipeline.apply(PubsubIO.readMessages()
    .fromSubscription(String.format("projects/%1$s/subscriptions/%2$s", projectId, pubsubSubscription)));

messages.apply(ParDo.of(new DoFn<PubsubMessage, TableRow>() {
    @ProcessElement
    public void processElement(ProcessContext c) {
        JSONObject data = new JSONObject(new String(c.element().getPayload()));
        String key = getKeyFromData(data);
        TableRow sideInputData = c.sideInput(mapData).get(key);
        if (sideInputData != null) {
            LOG.info("holyWowItWOrked");
            c.output(new TableRow());
        } else {
            LOG.info("noSideInputDataHere");
        }
    }
}).withSideInputs(mapData));
The pipeline throws the exception and fails before logging anything from within the ParDo.
Stack trace:
java.lang.IllegalArgumentException: ByteString would be too long: 644959474+1551393497
com.google.cloud.dataflow.worker.repackaged.com.google.protobuf.ByteString.concat(ByteString.java:524)
com.google.cloud.dataflow.worker.repackaged.com.google.protobuf.ByteString.balancedConcat(ByteString.java:576)
com.google.cloud.dataflow.worker.repackaged.com.google.protobuf.ByteString.balancedConcat(ByteString.java:575)
com.google.cloud.dataflow.worker.repackaged.com.google.protobuf.ByteString.balancedConcat(ByteString.java:575)
com.google.cloud.dataflow.worker.repackaged.com.google.protobuf.ByteString.balancedConcat(ByteString.java:575)
com.google.cloud.dataflow.worker.repackaged.com.google.protobuf.ByteString.copyFrom(ByteString.java:559)
com.google.cloud.dataflow.worker.repackaged.com.google.protobuf.ByteString$Output.toByteString(ByteString.java:1006)
com.google.cloud.dataflow.worker.WindmillStateInternals$WindmillBag.persistDirectly(WindmillStateInternals.java:575)
com.google.cloud.dataflow.worker.WindmillStateInternals$SimpleWindmillState.persist(WindmillStateInternals.java:320)
com.google.cloud.dataflow.worker.WindmillStateInternals$WindmillCombiningState.persist(WindmillStateInternals.java:951)
com.google.cloud.dataflow.worker.WindmillStateInternals.persist(WindmillStateInternals.java:216)
com.google.cloud.dataflow.worker.StreamingModeExecutionContext$StepContext.flushState(StreamingModeExecutionContext.java:513)
com.google.cloud.dataflow.worker.StreamingModeExecutionContext.flushState(StreamingModeExecutionContext.java:363)
com.google.cloud.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1000)
com.google.cloud.dataflow.worker.StreamingDataflowWorker.access$800(StreamingDataflowWorker.java:133)
com.google.cloud.dataflow.worker.StreamingDataflowWorker$7.run(StreamingDataflowWorker.java:771)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)
Check out the section called "Pattern: Streaming mode large lookup tables" in Guide to common Cloud Dataflow use-case patterns, Part 2. It might be the only viable solution since your side input doesn't fit into memory.
Description:
A large (in GBs) lookup table must be accurate, and changes often or does not fit in memory.
Example:
You have point of sale information from a retailer and need to associate the name of the product item with the data record which contains the productID. There are hundreds of thousands of items stored in an external database that can change constantly. Also, all elements must be processed using the correct value.
Solution:
Use the "Calling external services for data enrichment" pattern
but rather than calling a micro service, call a read-optimized NoSQL
database (such as Cloud Datastore or Cloud Bigtable) directly.
For each value to be looked up, create a Key Value pair using the KV
utility class. Do a GroupByKey to create batches of the same key type
to make the call against the database. In the DoFn, make a call out to
the database for that key and then apply the value to all values by
walking through the iterable. Follow best practices with client
instantiation as described in "Calling external services for data
enrichment".
Other relevant patterns are described in Guide to common Cloud Dataflow use-case patterns, Part 1:
Pattern: Slowly-changing lookup cache
Pattern: Calling external services for data enrichment
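As a very rough sketch of that pattern (EnrichFromLookupFn, LookupClient, and extractKey are hypothetical placeholders for your read-optimized store client and key extraction; batching via GroupByKey is omitted for brevity):
static class EnrichFromLookupFn extends DoFn<PubsubMessage, TableRow> {

    private transient LookupClient client; // hypothetical Bigtable/Datastore wrapper

    @Setup
    public void setup() {
        // Create the client once per DoFn instance rather than per element.
        client = LookupClient.create();
    }

    @ProcessElement
    public void processElement(ProcessContext c) {
        String key = extractKey(c.element().getPayload()); // hypothetical helper
        TableRow enrichment = client.get(key);              // point read by key
        if (enrichment != null) {
            c.output(enrichment);
        }
    }
}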

Apache camel Composed Message Processor

I am using Apache Camel in my application and am trying to use the Composed Message Processor pattern. I have an exchange whose body contains some URLs to hit, and by using split(body(), MyAggregationStrategy()) I am trying to fetch the data from the URLs and combine the results with the aggregation strategy. But there is a problem where I am stuck: if there is an invalid URL on the first line of the body, the aggregation works fine but the route does not move on to the next processor; if the invalid URL is anywhere other than the first line, it works fine.
Please help.
Here is the code for reference
onException(HttpOperationFailedException.class).handled(true)
    .retryAttemptedLogLevel(LoggingLevel.DEBUG)
    .maximumRedeliveries(5).redeliveryDelay(3000)
    .process(new HttpExceptionProcessor(exceptions));

from("jms:queue:supplier")
    .process(new RequestParserProcessor(payloadDetailsMap,
            metaDataDetailsPOJO, routesEndpointNamePOJO))
    .choice()
        .when(new AggregateStrategy(metaDataDetailsPOJO))
            .to("direct:aggregate")
        .otherwise()
            .to("direct:single");

from("direct:aggregate")
    .process(new SplitBodyProcessor())
    .split(body(), new AggregatePayload(aggregatePayload))
        .to("direct:aggregatepayloadData")
    .end()
    .to("direct:payloadDataAggregated").end();

from("direct:aggregatepayloadData")
    .process(basicProcessor)
    .recipientList(header(ApplicationConstants.URL));

from("direct:payloadDataAggregated")
    .process(new AggregateJsonGenerator(aggregatePayload,
            payloadDetailsMap, metaDataDetailsPOJO));
In this code, AggregateJsonGenerator is never called if there is an invalid URL on the first hit.
You probably need to set continued(true) in your onException code. See here:
http://camel.apache.org/exception-clause.html
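In the Java DSL that would look roughly like this (keeping the redelivery settings from the question); continued(true) makes Camel catch the exception and continue routing from the point where it occurred, so the processor after the split is still reached:
onException(HttpOperationFailedException.class)
    .continued(true)
    .retryAttemptedLogLevel(LoggingLevel.DEBUG)
    .maximumRedeliveries(5).redeliveryDelay(3000)
    .process(new HttpExceptionProcessor(exceptions));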

Using Olingo v2 Java as client for PATCH to OData v2 service

I'm trying to use Olingo to provide a client for interacting with an OData service (also written in Olingo). I'm trying to send a PATCH. However, the standard validation routines are kicking in and if I do not include those elements of the entity that are marked as non-nullable using the standard Olingo tools, I get an error.
in https://olingo.apache.org/doc/odata2/tutorials/OlingoV2BasicClientSample.html it says:
With an HTTP MERGE/PATCH it is also possible to send only the to be updated data as POST Body and omitting the unchanged data. But this is (currently) not shown within this sample.
Unfortunately I'm not sure how to do this; there doesn't seem to be any way to flag to the EntityProvider.writeEntry method that it is a PATCH rather than a POST/PUT:
EntityProviderWriteProperties properties = EntityProviderWriteProperties
.serviceRoot(rootUri).omitJsonWrapper(true).contentOnly(true)
.build();
// serialize data into ODataResponse object
ODataResponse response = EntityProvider.writeEntry(contentType,
entitySet, data, properties);
At this point in my code I get an error if "data" does not contain an entry for my non-nullable fields. The response also returns null values for all the attributes of the entity that aren't in my "data".
I deal with this by manipulating the response to remove all entries not in my "data" after the "standard" generation, but I imagine there must be a better way, even if I can't see it. Any suggestions on how to deal with this?
You have to create an "ExpandSelectTreeNode" which contains only the name of the selected properties to be serialized.
Assuming that your data is a HashMap with the values, you can use the following code as an example to start from:
// data = new HashMap<>();
ExpandSelectTreeNode node = ExpandSelectTreeNode.entitySet(entitySet)
.selectedProperties(new ArrayList<String>(data.keySet())).build();
EntityProviderWriteProperties properties = EntityProviderWriteProperties
.serviceRoot(rootUri).omitJsonWrapper(true).contentOnly(true)
.expandSelectTree(node)
.build();
// serialize data into ODataResponse object
ODataResponse response = EntityProvider.writeEntry(contentType,
entitySet, data, properties);
Best Regards
Is the content type from the client application/json-patch+json?
