I'm implementing a pipeline that reads from a RabbitMQ queue.
I'm having problems when I read it as an unbounded stream:
it says the channel is already closed, the ack is not sent to RabbitMQ, and the message stays on the queue:
WARNING: Failed to finalize Finalization{expiryTime=2020-11-21T19:33:14.909Z, callback=org.apache.beam.sdk.io.Read$UnboundedSourceAsSDFWrapperFn$$Lambda$378/0x00000001007ee440@4ae82af9} for completed bundle CommittedImmutableListBundle{PCollection=Read RabbitMQ queue/Read(RabbitMQSource)/ParDo(UnboundedSourceAsSDFWrapper)/ParMultiDo(UnboundedSourceAsSDFWrapper)/ProcessKeyedElements/SplittableParDoViaKeyedWorkItems.GBKIntoKeyedWorkItems.out [PCollection], key=org.apache.beam.repackaged.direct_java.runners.local.StructuralKey$CoderStructuralKey@3607f949, elements=[ValueInGlobalWindow{value=ComposedKeyedWorkItem{key=[-55, 41, -123, 97, 13, 104, 92, 61, 92, 122, -19, 112, -90, 16, 7, -97, 89, 107, -80, 12, 9, 120, 10, -97, 72, 114, -62, -105, 101, -34, 96, 48, 30, -96, 8, -19, 23, -115, -9, 87, 1, -58, -127, 70, -59, -24, -40, -111, -63, -119, 51, -108, 126, 64, -4, -120, -41, 9, 56, -63, -18, -18, -1, 17, -82, 90, -32, 110, 67, -12, -97, 10, -107, -110, 13, -74, -47, -113, 122, 27, 52, 46, -111, -118, -8, 118, -3, 20, 71, -109, 65, -87, -94, 107, 114, 116, -110, -126, -79, -123, -67, 18, -33, 70, -100, 9, -81, -65, -2, 98, 33, -122, -46, 23, -103, -70, 79, -23, 74, 9, 5, -9, 65, -33, -52, 5, 9, 101], elements=[], timers=[TimerData{timerId=1:1605986594072, timerFamilyId=, namespace=Window(org.apache.beam.sdk.transforms.windowing.GlobalWindow@4958d651), timestamp=2020-11-21T19:23:14.072Z, outputTimestamp=2020-11-21T19:23:14.072Z, domain=PROCESSING_TIME}]}, pane=PaneInfo.NO_FIRING}], minimumTimestamp=-290308-12-21T19:59:05.225Z, synchronizedProcessingOutputWatermark=2020-11-21T19:23:14.757Z}
com.rabbitmq.client.AlreadyClosedException: channel is already closed due to clean channel shutdown; protocol method: #method<channel.close>(reply-code=200, reply-text=OK, class-id=0, method-id=0)
at com.rabbitmq.client.impl.AMQChannel.ensureIsOpen(AMQChannel.java:258)
at com.rabbitmq.client.impl.AMQChannel.transmit(AMQChannel.java:427)
at com.rabbitmq.client.impl.AMQChannel.transmit(AMQChannel.java:421)
at com.rabbitmq.client.impl.recovery.RecoveryAwareChannelN.basicAck(RecoveryAwareChannelN.java:93)
at com.rabbitmq.client.impl.recovery.AutorecoveringChannel.basicAck(AutorecoveringChannel.java:428)
at org.apache.beam.sdk.io.rabbitmq.RabbitMqIO$RabbitMQCheckpointMark.finalizeCheckpoint(RabbitMqIO.java:433)
at org.apache.beam.runners.direct.EvaluationContext.handleResult(EvaluationContext.java:195)
at org.apache.beam.runners.direct.QuiescenceDriver$TimerIterableCompletionCallback.handleResult(QuiescenceDriver.java:287)
at org.apache.beam.runners.direct.DirectTransformExecutor.finishBundle(DirectTransformExecutor.java:189)
at org.apache.beam.runners.direct.DirectTransformExecutor.run(DirectTransformExecutor.java:126)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
However, if I include withMaxNumRecords, I receive the message and the ack is sent to the RabbitMQ queue, but the source then behaves as bounded data.
My code is below:
Pipeline p = Pipeline.create(options);
PCollection<RabbitMqMessage> messages = p.apply("Read RabbitMQ queue",
RabbitMqIO.read()
.withUri("amqp://guest:guest@localhost:5672")
.withQueue("queue")
//.withMaxNumRecords(1) // TRANSFORMS THE READ INTO A BOUNDED ONE
);
PCollection<TableRow> rows = messages.apply("Transform Json to TableRow",
ParDo.of(new DoFn<RabbitMqMessage, TableRow>() {
@ProcessElement
public void processElement(ProcessContext c) {
ObjectMapper objectMapper = new ObjectMapper();
String jsonInString = new String(c.element().getBody());
LOG.info(jsonInString);
}
}));
rows.apply(
"Write to BigQuery",
BigQueryIO.writeTableRows()
.to("livelo-analytics-dev:cart_idle.cart_idle_process")
.withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER)
.withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
);
Could someone help with this?
I sent an email to the Apache dev mailing list and got an awesome answer from Boyuan Zhang that worked as a workaround for me:
As a workaround, you can add --experiments=use_deprecated_read when launching your pipeline to bypass the SDF unbounded source wrapper.
--experiments=use_deprecated_read
I passed it as an argument on the command line and it worked fine for me.
I am trying two different ways to decode a Phoenix session cookie.
The first is Elixir's interactive shell, and the second is Java.
Please see the following examples:
IEx
iex(1)> set_cookie = "SFMyNTY.g3QAAAABbQAAAAtfY3NyZl90b2tlbm0AAAAYZFRuNUtQMkJ5YWtKT1JnWUtCeXhmNmdP.l0T3G-i8I5dMwz7lEZnQAeK_WeqEZTxcDeyNY2poz_M"
"SFMyNTY.g3QAAAABbQAAAAtfY3NyZl90b2tlbm0AAAAYZFRuNUtQMkJ5YWtKT1JnWUtCeXhmNmdP.l0T3G-i8I5dMwz7lEZnQAeK_WeqEZTxcDeyNY2poz_M"
iex(2)> [_, payload, _] = String.split(set_cookie, ".", parts: 3)
["SFMyNTY",
"g3QAAAABbQAAAAtfY3NyZl90b2tlbm0AAAAYZFRuNUtQMkJ5YWtKT1JnWUtCeXhmNmdP",
"l0T3G-i8I5dMwz7lEZnQAeK_WeqEZTxcDeyNY2poz_M"]
iex(3)> {:ok, encoded_term } = Base.url_decode64(payload, padding: false)
{:ok,
<<131, 116, 0, 0, 0, 1, 109, 0, 0, 0, 11, 95, 99, 115, 114, 102, 95, 116, 111,
107, 101, 110, 109, 0, 0, 0, 24, 100, 84, 110, 53, 75, 80, 50, 66, 121, 97,
107, 74, 79, 82, 103, 89, 75, 66, 121, 120, 102, ...>>}
iex(4)> :erlang.binary_to_term(encoded_term)
%{"_csrf_token" => "dTn5KP2ByakJORgYKByxf6gO"}
Java
public static String decodePhoenixSessionCookie(String sessionCookie) {
String payload = sessionCookie.split("\\.")[1];
byte[] encoded_term = Base64.getUrlDecoder().decode(payload.getBytes());
return new String(encoded_term);
}
Java Output
�tm_csrf_tokenmdTn5KP2ByakJORgYKByxf6gO
What I wonder is this: with the Java approach I can recover the field name and its value, but some gibberish characters come with them.
Do you know the reason for this?
Is there a way to get clean output in Java, like the Elixir output?
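The decoded bytes shown in the IEx session suggest where the extra characters come from: the payload is in Erlang's external term format, so it starts with a version byte (131), a map tag (116) with a 4-byte entry count, and each string is a binary tag (109) followed by a 4-byte length. The following is a minimal sketch, not a full term decoder: it assumes the session is a flat map of binaries, as in this example, and it does not verify the cookie signature. The class and method names are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;

class PhoenixCookieSketch {
    private static final int VERSION = 131;    // external term format version byte
    private static final int MAP_EXT = 116;    // map tag, followed by a 4-byte entry count
    private static final int BINARY_EXT = 109; // binary tag, followed by a 4-byte length

    // Decodes the middle part of the cookie, assuming a flat map of binaries.
    static Map<String, String> decode(String sessionCookie) {
        String payload = sessionCookie.split("\\.")[1];
        ByteBuffer buf = ByteBuffer.wrap(Base64.getUrlDecoder().decode(payload));
        if ((buf.get() & 0xFF) != VERSION) {
            throw new IllegalArgumentException("unexpected version byte");
        }
        if ((buf.get() & 0xFF) != MAP_EXT) {
            throw new IllegalArgumentException("only a top-level map is handled in this sketch");
        }
        int entries = buf.getInt(); // ByteBuffer reads big-endian by default
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < entries; i++) {
            String key = readBinary(buf);
            String value = readBinary(buf);
            result.put(key, value);
        }
        return result;
    }

    private static String readBinary(ByteBuffer buf) {
        if ((buf.get() & 0xFF) != BINARY_EXT) {
            throw new IllegalArgumentException("only binaries are handled in this sketch");
        }
        byte[] bytes = new byte[buf.getInt()];
        buf.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String cookie = "SFMyNTY.g3QAAAABbQAAAAtfY3NyZl90b2tlbm0AAAAYZFRuNUtQMkJ5YWtKT1JnWUtCeXhmNmdP.l0T3G-i8I5dMwz7lEZnQAeK_WeqEZTxcDeyNY2poz_M";
        System.out.println(decode(cookie)); // prints {_csrf_token=dTn5KP2ByakJORgYKByxf6gO}
    }
}
```

Sessions that hold other term types (atoms, integers, nested maps) would need more tags than this sketch handles.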
I tried to pass a filename with Unicode characters to my Java program in Windows cmd, but the filename the program received was broken. The Unicode characters were shown as ?, and an IOException was thrown when I tried to read the file.
However, if I use a wildcard like *.txt, it works correctly.
For example, I have a file called [テスト]测试文件1.txt in my directory.
I wrote a simple Java program that shows the arguments it receives and also prints each string's bytes.
import java.util.Arrays;
public class FileArg {
public static void main(String[] args) {
for (String fn: args) {
System.out.println(fn);
System.out.println(Arrays.toString(fn.getBytes()));
}
}
}
My Java version is 15, I have set chcp to 65001, and I also run the program with the -Dfile.encoding=UTF-8 flag.
Then I run java -Dfile.encoding=UTF-8 FileArg "[テスト]测试文件1.txt" [テスト]测试文件1.txt *.txt.
The filename is broken if I pass it directly, but it works perfectly with the wildcard.
The output:
[???]??文件1.txt
[91, 63, 63, 63, 93, 63, 63, -26, -106, -121, -28, -69, -74, 49, 46, 116, 120, 116]
[???]??文件1.txt
[91, 63, 63, 63, 93, 63, 63, -26, -106, -121, -28, -69, -74, 49, 46, 116, 120, 116]
[テスト]测试文件1.txt
[91, -29, -125, -122, -29, -126, -71, -29, -125, -120, 93, -26, -75, -117, -24, -81, -107, -26, -106, -121, -28, -69, -74, 49, 46, 116, 120, 116]
BTW, my default code page is cp950 (Big5).
How can I get it to work?
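The printed byte arrays already narrow down where the damage happens. A small sketch, using only the values from the output above (the class name is made up for illustration): encoding the real filename as UTF-8 never yields byte 63 for these characters, while the broken arguments contain literal 63 bytes, i.e. plain ? characters. That would be consistent with the characters being replaced before main receives the arguments, rather than during getBytes().

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ArgBytesCheck {
    // The example filename, written with Unicode escapes so the source file encoding does not matter.
    static final String FILE_NAME = "[\u30C6\u30B9\u30C8]\u6D4B\u8BD5\u6587\u4EF61.txt";

    // Bytes printed in the question for the directly passed argument.
    static final byte[] BROKEN = {91, 63, 63, 63, 93, 63, 63, -26, -106, -121, -28, -69, -74, 49, 46, 116, 120, 116};

    public static void main(String[] args) {
        // Matches the byte array printed for the wildcard argument.
        System.out.println(Arrays.toString(FILE_NAME.getBytes(StandardCharsets.UTF_8)));
        // The broken bytes decode to literal question marks followed by the two surviving characters.
        System.out.println(new String(BROKEN, StandardCharsets.UTF_8));
    }
}
```

Only the last two CJK characters survive in the broken arguments, which fits the observation that the default code page is cp950.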
I have a ProducerRecord object.
ProducerRecord<String, byte[]> hdr = addHeader.addMDGHeader(record);
I'm trying to write a test that checks a particular header key exists.
If I print hdr.headers().toString() I get the following RecordHeaders(headers = [RecordHeader(key = mdpHeader, value = [123, 34, 83, 101, 113, 117, 101, 110, 99, 101, 78, 111, 34, 58, 48, 44, 34, 84, 101, 109, 112, 108, 97, 116, 101, 115, 34, 58, 91, 93, 125])], isReadOnly = false).
How do I pull out mdpHeader?
The Header.value() method returns a byte array (byte[]), which you can then convert into a String:
String value = new String(header.value(), StandardCharsets.UTF_8);
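As a quick check of that conversion, the sketch below decodes the exact bytes printed in the question, without needing Kafka on the classpath. In a real test the bytes would come from the record itself, for example via hdr.headers().lastHeader("mdpHeader"), which returns null when the key is absent and so can also serve as the existence check. The class name here is made up for illustration.

```java
import java.nio.charset.StandardCharsets;

class HeaderValueDecode {
    // Bytes copied from the printed RecordHeader value in the question.
    static final byte[] MDP_HEADER_VALUE = {123, 34, 83, 101, 113, 117, 101, 110, 99, 101, 78, 111,
            34, 58, 48, 44, 34, 84, 101, 109, 112, 108, 97, 116, 101, 115, 34, 58, 91, 93, 125};

    // Same conversion as in the answer: interpret the header bytes as UTF-8 text.
    static String decode(byte[] headerValue) {
        return new String(headerValue, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // With Kafka available, the bytes would instead come from:
        //   hdr.headers().lastHeader("mdpHeader").value()
        System.out.println(decode(MDP_HEADER_VALUE)); // prints {"SequenceNo":0,"Templates":[]}
    }
}
```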
I have a byte array, and I want to compute its MD5 hash in Java and C# separately. However, they generate very different results.
Below is my C# code:
byte[] input = { 90, 12, 200, 139, 85, 104, 9, 202, 0, 0, 0, 0, 28, 251, 54, 201, 233, 153, 79, 1 };
MD5 md5 = MD5.Create();
byte[] result = md5.ComputeHash(input);
It generates the MD5 hash: 85,126,37,15,86,254,54,94,243,185,219,84,21,17,192,153.
And below is the Java code:
byte[] input = {90,12,-56,-117,85,104,9,-54,0,0,0,0,28,-5,54,-47,-23,-103,79,1};
byte[] md5 = MessageDigest.getInstance("MD5").digest(input);
and it results in:
-56,-74,-89,-76,9,35,-83,-89,-73,-39,17,83,24,18,-91,-62,
As you can see, the results are quite different. I know C# uses unsigned bytes and Java uses signed bytes, but I can't see how these two results could be the same hash.
Thanks in advance.
Your C# and Java inputs aren't the same.
Let's try to convert the C# input to signed bytes:
byte[] input = { 90, 12, 200, 139, 85, 104, 9, 202, 0, 0, 0, 0, 28, 251, 54, 201, 233, 153, 79, 1 };
sbyte[] signedInput = input.Select(i => unchecked((sbyte)i)).ToArray();
Console.WriteLine(string.Join(", ", signedInput));
This outputs:
90, 12, -56, -117, 85, 104, 9, -54, 0, 0, 0, 0, 28, -5, 54, -55, -23, -103, 79, 1
One byte differs: at index 15 the C# input has 201, which is -55 as a signed byte, while the Java version contains -47 at this offset.
And just to be sure, we can do a simple check using the Java version's input:
var javaInput = new[] { 90, 12, -56, -117, 85, 104, 9, -54, 0, 0, 0, 0, 28, -5, 54, -47, -23, -103, 79, 1 };
var javaInputUnsigned = javaInput.Select(i => unchecked((byte)i)).ToArray();
var hash = MD5.Create().ComputeHash(javaInputUnsigned).Select(i => unchecked((sbyte)i)).ToArray();
Console.WriteLine(string.Join(", ", hash));
This yields the same result as in the Java version:
-56, -74, -89, -76, 9, 35, -83, -89, -73, -39, 17, 83, 24, 18, -91, -62
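The same comparison can be made from the Java side. This is a minimal sketch using the two inputs from the question (the class and method names are made up for illustration): it narrows the C# values to signed bytes, locates the first index where the arrays differ, and hashes the C#-equivalent input so the result can be compared with the C# output.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

class Md5InputCheck {
    // The C# input, written as unsigned values (0..255).
    static final int[] CSHARP_INPUT = {90, 12, 200, 139, 85, 104, 9, 202, 0, 0, 0, 0, 28, 251, 54, 201, 233, 153, 79, 1};
    // The Java input exactly as posted in the question.
    static final byte[] JAVA_INPUT = {90, 12, -56, -117, 85, 104, 9, -54, 0, 0, 0, 0, 28, -5, 54, -47, -23, -103, 79, 1};

    // Narrowing to byte keeps the bit pattern, so 201 becomes -55.
    static byte[] toSignedBytes(int[] unsigned) {
        byte[] out = new byte[unsigned.length];
        for (int i = 0; i < unsigned.length; i++) {
            out[i] = (byte) unsigned[i];
        }
        return out;
    }

    // Returns the first index where the arrays differ, or -1 if they are equal.
    static int firstDifference(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            if (a[i] != b[i]) {
                return i;
            }
        }
        return a.length == b.length ? -1 : n;
    }

    static byte[] md5(byte[] input) {
        try {
            return MessageDigest.getInstance("MD5").digest(input);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    public static void main(String[] args) {
        byte[] fromCSharp = toSignedBytes(CSHARP_INPUT);
        System.out.println("first differing index: " + firstDifference(fromCSharp, JAVA_INPUT));
        System.out.println("MD5 of C# input:   " + Arrays.toString(md5(fromCSharp)));
        System.out.println("MD5 of Java input: " + Arrays.toString(md5(JAVA_INPUT)));
    }
}
```

Signedness only changes how each byte is printed, not the bits that get hashed, so identical inputs would give identical digests in both languages.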
While I was testing the send and receive methods which I created for my project I ran into a strange problem.
When I send a certain message using a correlationId that is based on a UUID object, the receiving party gets a slightly modified version of this correlationId (which cannot be deserialised).
On the sending side I do this:
MessageProperties properties = new MessageProperties();
properties.setCorrelationId(MessageSerializer.serialize(UUID.randomUUID().toString()));
On my last test the UUID generated was: "d4170243-9e7e-4c42-9168-f9da4debc5bb"
This generates the following correlationId (when serialized):
[-84, -19, 0, 5, 116, 0, 36, 100, 52, 49, 55, 48, 50, 52, 51, 45, 57, 101, 55, 101, 45, 52, 99, 52, 50, 45, 57, 49, 54, 56, 45, 102, 57, 100, 97, 52, 100, 101, 98, 99, 53, 98, 98]
When I receive the message on the other side this id is slightly changed:
[-17, -65, -67, -17, -65, -67, 0, 5, 116, 0, 36, 100, 52, 49, 55, 48, 50, 52, 51, 45, 57, 101, 55, 101, 45, 52, 99, 52, 50, 45, 57, 49, 54, 56, 45, 102, 57, 100, 97, 52, 100, 101, 98, 99, 53, 98, 98]
When using the RabbitMQ management plugin I noticed that the id already changed upon arrival at the queue.
Tracing my code on the sending side brings me to the send method of the RabbitTemplate class.
RabbitTemplate template = new RabbitTemplate(connection);
template.setExchange("amq.direct");
template.setRoutingKey("some.route");
template.send(message);
But I can't figure out what's causing this problem. I guess it's just me using the correlationId option the wrong way. Could someone help me out?
Appreciate it.
The explanation is as follows:
You serialize the UUID string to a byte array.
Your serialization prepends non-ASCII bytes to this array (the Java serialization stream header, [-84, -19, 0, 5, ...]).
The reference documentation states that the correlation id is a shortstr. The RabbitMQ client converts this byte array to a string using new String(yourArray, "UTF-8").
The non-ASCII bytes are not valid UTF-8, so they "corrupt" the conversion from byte[] to String: each one is replaced by the replacement character, which encodes back as [-17, -65, -67].
You can get the same result with the following code:
new String(MessageSerializer.serialize(UUID.randomUUID().toString()), "UTF-8").getBytes("UTF-8")
Which will return:
[-17, -65, -67, -17, -65, -67, 0, 5, 116, 0, 36, 100, 52, 49, 55, 48, 50, 52, 51, 45, 57, 101, 55, 101, 45, 52, 99, 52, 50, 45, 57, 49, 54, 56, 45, 102, 57, 100, 97, 52, 100, 101, 98, 99, 53, 98, 98]
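The round trip can be reproduced with the standard library alone. The sketch below assumes that MessageSerializer.serialize uses plain Java serialization (ObjectOutputStream), which the leading bytes [-84, -19, 0, 5] in the question suggest; the class and method names are made up for illustration. It also shows that the UTF-8 bytes of the UUID string itself survive the same conversion unchanged, which is one way to avoid the problem.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class CorrelationIdRoundTrip {
    // Stand-in for MessageSerializer.serialize, assuming plain Java serialization.
    static byte[] javaSerialize(String value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Mimics the byte[] -> String -> byte[] conversion described in the answer.
    static byte[] utf8RoundTrip(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String uuid = "d4170243-9e7e-4c42-9168-f9da4debc5bb";

        byte[] serialized = javaSerialize(uuid);
        System.out.println(Arrays.toString(serialized));                // starts with -84, -19, 0, 5
        System.out.println(Arrays.toString(utf8RoundTrip(serialized))); // starts with -17, -65, -67, -17, -65, -67, 0, 5

        // Plain UTF-8 bytes of the UUID string are pure ASCII and are not altered.
        byte[] plain = uuid.getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(plain, utf8RoundTrip(plain))); // prints true
    }
}
```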