Talend : capture value from row1 and replace it in the entire column - java

I want to take UK from the first row and use it to replace every value in the COUNTRY column, without changing the values in the ZONE column. I tried a regex in the expression builder, but it failed.
COUNTRY  ZONE
UK       12
AU       44
FR       21
GER      20
FR       02

First, your job design will look like this
Second, use a tSampleRow to select the range of lines you need (in your case, the first line)
Third, store the selected line in a global variable like this
Finally, in the tMap, read your global variable like this
Here is the output (I have 201 lines, so UK is printed 201 times):
.--------.
|tLogRow_1|
|=------=|
|mystring|
|=------=|
|UK|
|UK |
|UK |
|UK |
|UK |
'--------'
[statistics] disconnected
Job operation ended at 14:00 21/02/2022. [exit code = 0]
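The idea behind the steps above (capture the first row's value once, then reuse it for every row) can be sketched in plain Java. This is only an illustration under assumptions: the class name, the "firstCountry" key and the String[] {country, zone} row layout are made up for the example, and the local globalMap merely stands in for the job-wide map that Talend provides.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class FirstRowCountry {
    // Stand-in for Talend's job-wide globalMap (hypothetical here).
    static final Map<String, Object> globalMap = new HashMap<>();

    // Each row is {country, zone}; returns copies with the country replaced.
    static List<String[]> replaceCountry(List<String[]> rows) {
        List<String[]> out = new ArrayList<>();
        if (rows.isEmpty()) {
            return out;
        }
        // Step 1: remember the country of the first row.
        globalMap.put("firstCountry", rows.get(0)[0]);
        // Step 2: emit every row with that country and its own zone.
        for (String[] row : rows) {
            String country = (String) globalMap.get("firstCountry");
            out.add(new String[] {country, row[1]});
        }
        return out;
    }
}
```

The zone values are copied through untouched, which matches the requirement that only the country column changes.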

Related

Java Parsing a String to Extract Data

I have to write a program but I have no idea where to start. Can anyone help me with an outline of how I should go about it? Please excuse my novice level at programming. I have provided the input and output of the program.
The trouble that I'm facing is how do I handle the input text? How should I store the input text to extract the data that I need to produce the output commands? Any guidance would be so helpful.
A little explanation of the input:
The output will start with APPLE1: CT= (whatever number is there for CT in line 4)
The following lines of the output will begin with "APPLES:"
I must include and extract the values for CR, PLANTING and RW in the output.
Wherever there is a non-zero or not null in the DATA portion, it will appear in the output.
When the program reads END, "APP;APPLER:CT=(whatever number);" will be the last two commands
INPUT:
<apple:ct=12;
FARM DATA
INPUT DATA
CT CH CR PLANTING RW DATA
12 YES PG -0 FA=1 R=CODE1 MM2 COA COB CI COC COD
0 0 1 0
COE RN COF COG COH
4 00 0
COI COJ D
0
FA=2 R=CODE2 112 COA COB CI COC COD
0 0 0 0
COE RN COF COG COH
4 00 0
COI COJ D
7
END
OUTPUT:
APPLE1:CT=12;
APPLES:CR=PG-0,FA=1,R=CODE1,RW=MM2,COC=1,COE=4;
APPLES:FA=2,R=CODE2,RW=112,COE=4,COI=7;
APP;
APPLER:CT=12;
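One way to approach the output side of the task described above is to store each record as ordered key/value pairs and only print the entries that are set. The sketch below assumes the input has already been split into such pairs (the parsing itself is not shown), and the class and method names are hypothetical. It treats empty values and values made only of zeros (such as 0 or 00) as "not set" for the DATA portion.

```java
import java.util.Map;
import java.util.StringJoiner;

final class AppleCommandFormatter {
    // True for null, empty, or all-zero values such as "0" or "00".
    static boolean isBlankOrZero(String value) {
        return value == null || value.trim().isEmpty() || value.trim().matches("0+");
    }

    // header: fields that are always printed when present (CR, FA, R, RW).
    // data:   DATA fields that are printed only when non-zero.
    // Both maps should preserve insertion order (e.g. LinkedHashMap).
    static String formatApples(Map<String, String> header, Map<String, String> data) {
        StringJoiner command = new StringJoiner(",", "APPLES:", ";");
        header.forEach((key, value) -> {
            if (value != null && !value.isEmpty()) {
                command.add(key + "=" + value);
            }
        });
        data.forEach((key, value) -> {
            if (!isBlankOrZero(value)) {
                command.add(key + "=" + value);
            }
        });
        return command.toString();
    }
}
```

Using insertion-ordered maps keeps the fields in the order they were read, which is what the expected output appears to require.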

Predict function R returns 0.0 [duplicate]

I posted earlier today about an error I was getting with using the predict function. I was able to get that corrected, and thought I was on the right path.
I have a number of observations (actuals) and I have a few data points that I want to extrapolate or predict. I used lm to create a model, then I tried to use predict with the actual value that will serve as the predictor input.
This code is all repeated from my previous post, but here it is:
df <- read.table(text = '
Quarter Coupon Total
1 "Dec 06" 25027.072 132450574
2 "Dec 07" 76386.820 194154767
3 "Dec 08" 79622.147 221571135
4 "Dec 09" 74114.416 205880072
5 "Dec 10" 70993.058 188666980
6 "Jun 06" 12048.162 139137919
7 "Jun 07" 46889.369 165276325
8 "Jun 08" 84732.537 207074374
9 "Jun 09" 83240.084 221945162
10 "Jun 10" 81970.143 236954249
11 "Mar 06" 3451.248 116811392
12 "Mar 07" 34201.197 155190418
13 "Mar 08" 73232.900 212492488
14 "Mar 09" 70644.948 203663201
15 "Mar 10" 72314.945 203427892
16 "Mar 11" 88708.663 214061240
17 "Sep 06" 15027.252 121285335
18 "Sep 07" 60228.793 195428991
19 "Sep 08" 85507.062 257651399
20 "Sep 09" 77763.365 215048147
21 "Sep 10" 62259.691 168862119', header=TRUE)
str(df)
'data.frame': 21 obs. of 3 variables:
$ Quarter : Factor w/ 24 levels "Dec 06","Dec 07",..: 1 2 3 4 5 7 8 9 10 11 ...
$ Coupon: num 25027 76387 79622 74114 70993 ...
$ Total: num 132450574 194154767 221571135 205880072 188666980 ...
Code:
model <- lm(df$Total ~ df$Coupon, data=df)
> model
Call:
lm(formula = df$Total ~ df$Coupon)
Coefficients:
(Intercept) df$Coupon
107286259 1349
Predict code (based on previous help):
(These are the predictor values I want to use to get the predicted value)
Quarter = c("Jun 11", "Sep 11", "Dec 11")
Total = c(79037022, 83100656, 104299800)
Coupon = data.frame(Quarter, Total)
Coupon$estimate <- predict(model, newdate = Coupon$Total)
Now, when I run that, I get this error message:
Error in `$<-.data.frame`(`*tmp*`, "estimate", value = c(60980.3823396919, :
replacement has 21 rows, data has 3
My original data frame that I used to build the model had 21 observations in it. I am now trying to predict 3 values based on the model.
I either don't truly understand this function, or have an error in my code.
Help would be appreciated.
Thanks
First, you want to use
model <- lm(Total ~ Coupon, data=df)
not model <-lm(df$Total ~ df$Coupon, data=df).
Second, by saying lm(Total ~ Coupon), you are fitting a model that uses Total as the response variable, with Coupon as the predictor. That is, your model is of the form Total = a + b*Coupon, with a and b the coefficients to be estimated. Note that the response goes on the left side of the ~, and the predictor(s) on the right.
Because of this, when you ask R to give you predicted values for the model, you have to provide a set of new predictor values, ie new values of Coupon, not Total.
Third, judging by your specification of newdata, it looks like you're actually after a model to fit Coupon as a function of Total, not the other way around. To do this:
model <- lm(Coupon ~ Total, data=df)
new.df <- data.frame(Total=c(79037022, 83100656, 104299800))
predict(model, new.df)
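To make the roles of response and predictor concrete, here is a rough sketch in plain Java (not R) of what a one-predictor least-squares fit and a prediction do. It is not how lm is implemented; the class name and fields are invented for illustration. The point is that prediction only ever takes new values of the predictor.

```java
final class SimpleLinearModel {
    final double intercept;
    final double slope;

    // Fits response = intercept + slope * predictor by ordinary least squares.
    SimpleLinearModel(double[] predictor, double[] response) {
        if (predictor.length != response.length || predictor.length < 2) {
            throw new IllegalArgumentException("need two or more paired observations");
        }
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < predictor.length; i++) {
            meanX += predictor[i];
            meanY += response[i];
        }
        meanX /= predictor.length;
        meanY /= predictor.length;

        double sxy = 0.0;
        double sxx = 0.0;
        for (int i = 0; i < predictor.length; i++) {
            sxy += (predictor[i] - meanX) * (response[i] - meanY);
            sxx += (predictor[i] - meanX) * (predictor[i] - meanX);
        }
        slope = sxy / sxx;
        intercept = meanY - slope * meanX;
    }

    // Prediction needs a new predictor value, never a response value.
    double predict(double newPredictor) {
        return intercept + slope * newPredictor;
    }
}
```

With Coupon as the response and Total as the predictor, the three new Total values would each be passed to predict, giving three estimates rather than 21.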
Thanks Hong, that was exactly the problem I was running into. The error you get suggests that the number of rows is wrong, but the problem is actually that the model has been trained using a command that ends up with the wrong names for parameters.
This is really a critical detail that is entirely non-obvious for lm and similar functions. Some tutorials use lines like lm(olive$Area~olive$Palmitic), which ends up with variable names such as olive$Palmitic, NOT Palmitic, so new data created with anewdata<-data.frame(Palmitic=2) can't then be used. If you use lm(Area~Palmitic,data=olive), the variable names are right and prediction works.
The real problem is that the error message does not indicate the problem at all:
Warning message: 'anewdata' had 1 rows but variable(s) found to have X rows
In your predict code you are using newdate instead of newdata; check that. Then just use Coupon$estimate <- predict(model, Coupon)
It will work.
To avoid errors, pay attention to the name of the independent variable in the new dataset: it must be the same as the name used in the model. Alternatively, you can nest the two functions without creating a new dataset:
model <- lm(Coupon ~ Total, data=df)
predict(model, data.frame(Total=c(79037022, 83100656, 104299800)))
Pay attention to how the model is specified. The next two commands look similar, but with the predict function the first works and the second does not.
model <- lm(Coupon ~ Total, data=df) # OK
model <- lm(df$Coupon ~ df$Total) # not OK

Adding a new date format to Natty DateParser

I'm a total newbie when it comes to Natty and ANTLR. Up to now, Natty has been great and has parsed dates with no problems. Recently we have started to receive a new date and time format which Natty has trouble extracting.
Mon 29 Feb 09:00:00 2016
It cannot extract the year due to it being separated from the rest of the date.
I've been trying to add my own format into DateParser, where it could pick up on this format as it does with any other.
I've made the following changes:
date_time: Added an extra rule called custom_dates which will be the new rule for my format
date_time
: (
(date)=>date (date_time_separator explicit_time)?
| explicit_time (time_date_separator date)?
| custom_dates
) -> ^(DATE_TIME date? explicit_time?)
| relative_time -> ^(DATE_TIME relative_time?)
;
custom_date: My new rule
custom_date
: relaxed_day_of_week WHITE_SPACE relaxed_day_of_month WHITE_SPACE relaxed_month (date_time_separator explicit_time)? relaxed_year
-> ^(EXPLICIT_DATE relaxed_day_of_week relaxed_day_of_month relaxed_month relaxed_year (date_time_separator explicit_time)?)
;
When I try to build Natty with my changes, it just hangs, and never finishes. The output up to that point is:
Decision can match input such as "COMMA WHITE_SPACE INT_00 INT_00" using multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input
warning(200): com\joestelmach\natty\generated\DateParser.g:444:73:
Decision can match input such as "COMMA WHITE_SPACE INT_00 {INT_13..INT_19, INT_20..INT_23}" using multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input
warning(200): com\joestelmach\natty\generated\DateParser.g:496:45:
Decision can match input such as "WHITE_SPACE IN {COMMA, WHITE_SPACE}" using multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input
warning(200): com\joestelmach\natty\generated\DateParser.g:504:77:
Decision can match input such as "WHITE_SPACE IN {COMMA, WHITE_SPACE}" using multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input
Am I possibly going the wrong way about this? I've taken a look at the Natty and ANTLR v3 documentation but there isn't much to go on.
Thanks in advance
EDIT:
As requested in the comments below, I've added where the first warning occurs. However, what I've included above is just a small snapshot of the dozens of warnings that were already there before I modified any code with my own rules.
The first warning appears in the date_time_separator
date_time_separator
: WHITE_SPACE (AT WHITE_SPACE)?
| WHITE_SPACE? COMMA WHITE_SPACE? (AT WHITE_SPACE)?
| T
;
One observation I've made is when I changed my rule to always include the time
custom_date
: relaxed_day_of_week WHITE_SPACE relaxed_day_of_month WHITE_SPACE relaxed_month (date_time_separator explicit_time) relaxed_year
-> ^(EXPLICIT_DATE relaxed_day_of_week relaxed_day_of_month relaxed_month relaxed_year (date_time_separator explicit_time)?)
;
When I compile I receive this error:
error(202): com\joestelmach\natty\generated\DateParser.g:831:3: the decision cannot distinguish between alternative(s) 1,2 for input such as "INT_00 INT_00 INT_00 EOF"
Line 831 is where explicit_time resides. I cannot find anything on StackOverflow or elsewhere as to what this error means. I assume it means that there is some ambiguity between the two possible routes. However, I don't understand why merely adding my code should cause an error.
explicit_time_hours_minutes returns [String hours, String minutes, String ampm]
: hours (COLON | DOT)? minutes ((COLON | DOT)? seconds)? (WHITE_SPACE (meridian_indicator | (MILITARY_HOUR_SUFFIX | HOUR)))?
{$hours=$hours.text; $minutes=$minutes.text; $ampm=$meridian_indicator.text;}
-> hours minutes seconds? meridian_indicator?
| hours (WHITE_SPACE? meridian_indicator)?
{$hours=$hours.text; $ampm=$meridian_indicator.text;}
-> hours ^(MINUTES_OF_HOUR INT["0"]) meridian_indicator?
;
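As a possible stopgap while the grammar question is open, the specific format from the question (Mon 29 Feb 09:00:00 2016) can be parsed with the standard java.time API before handing anything else to Natty. This is a sketch of an alternative, not a change to Natty itself; the class name is made up, and it assumes English day and month abbreviations.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

final class TrailingYearDate {
    // Day-of-week, day, month, time, and the year at the very end.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("EEE d MMM HH:mm:ss uuuu", Locale.ENGLISH);

    // Throws DateTimeParseException when the text is not in this format,
    // so a caller could fall back to another parser in that case.
    static LocalDateTime parse(String text) {
        return LocalDateTime.parse(text, FORMAT);
    }
}
```

Because the pattern is explicit, the year is read from its trailing position without any ambiguity against the time fields.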

Use APDU commands to get some information for a card

I have a terminal that has its own API to establish a connection and send commands between the chip and the terminal. One function transmits the APDU command and returns the answer in a byte array.
For example, if I want to read the tag 5A (Application PAN), I send the following command:
byte[] byteArrayAPDU = new byte[]{(byte)0x00, (byte)0xCA, (byte)0x00, (byte)0x5A};
int nResult = SmartCardInterface.transmit(nCardHandle, byteArrayAPDU, byteArrayResponse);
The variable byteArrayResponse gets the response to the APDU command.
When I translate the value of byteArrayAPDU to a string of hexadecimal digits, this gives me: 00 CA 00 5A. And the response to that command is 6E 00 (class not supported).
My device follows ISO 7816 as its technical specification. Is the way in which I am sending APDU commands correct? I ask this because I have read that an APDU command must have at least 5 values, but I don't know what to send in the fifth parameter. I don't know what the length of the response is.
Can you give an example of how to get the tag 5A or something else in APDU commands?
If the command were correct, would I see the information as plain text when I cast the response to a string, in place of the 6E 00 I see at the moment?
The input and output values that you showed in your question suggest that your use of the method transmit() is correct, i.e. the second argument is a command APDU and the third argument is filled with the response APDU:
resultCode = SmartCardInterface.transmit(cardHandle, commandAPDU, responseAPDU);
Your question regarding the format and validity of APDU commands is rather broad. In general, the format of APDUs and a basic set of commands is defined in ISO/IEC 7816-4. Since you tagged the question with emv and mention the application primary account number, you are probably interacting with some form of EMV payment card (e.g. a credit or debit card from one of the major schemes). In that case, you would probably want to study the various specifications for EMV payment systems which define the data structures and application-specific commands for those cards.
Regarding your specific questions:
Do APDUs always consist of at least 5 bytes?
No, certainly not. Command APDUs consist of at least 4 bytes (the header bytes). These are
+-----+-----+-----+-----+
| CLA | INS | P1 | P2 |
+-----+-----+-----+-----+
Such a 4-byte APDU is called "case 1". This means that the command APDU does not contain a data field sent to the card and that the card is not expected to generate a response data field. So the response APDU is expected to only contain a response status word:
+-----+-----+
| SW1 | SW2 |
+-----+-----+
What is the 5th byte of a command APDU?
The 5th byte is a length field (or part of a length field in case of extended length APDUs, which I won't further explain in this post). Depending on the case, this length field may have two meanings:
If the command APDU does not have a data field, that length field indicates the expected length (Ne) of the response data field:
+-----+-----+-----+-----+-----+
| CLA | INS | P1 | P2 | Le |
+-----+-----+-----+-----+-----+
Le = 0x01 .. 0xFF: This means that the expected response data length Ne is 1, 2, ... 255 bytes (i.e. exactly the value of Le).
Le = 0x00: This means that the expected response data length Ne is 256 bytes. This is typically used to instruct the card to give you as many bytes as it has available (up to 256 bytes). So even if Le is set to 0x00, you won't always get exactly 256 bytes from the card.
If the command APDU itself has a data field, that length field indicates the length (Nc) of the command data field:
+-----+-----+-----+-----+-----+-----------------+
| CLA | INS | P1 | P2 | Lc | DATA (Nc bytes) |
+-----+-----+-----+-----+-----+-----------------+
Lc = 0x01 .. 0xFF: This means that the command data length Nc is 1, 2, ... 255 bytes (i.e. exactly the value of Lc).
Lc = 0x00: This is used to indicate an extended length APDU.
If there is a command data field and the command is expected to generate response data, that command APDU may again be followed by an Le field:
+-----+-----+-----+-----+-----+-----------------+-----+
| CLA | INS | P1 | P2 | Lc | DATA (Nc bytes) | Le |
+-----+-----+-----+-----+-----+-----------------+-----+
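The four layouts above can be assembled by one small helper. The following is a minimal sketch for short-length APDUs only (extended length is not handled); the class name, the method name and the convention that ne = 0 means "no Le field" are assumptions made for this example.

```java
import java.io.ByteArrayOutputStream;

final class ShortApdu {
    /**
     * Builds a short-length command APDU.
     *
     * @param data command data, or null/empty for none (at most 255 bytes)
     * @param ne   expected response length: 0 = omit Le, 1..256 = add Le
     */
    static byte[] build(int cla, int ins, int p1, int p2, byte[] data, int ne) {
        int nc = (data == null) ? 0 : data.length;
        if (nc > 255) {
            throw new IllegalArgumentException("short APDUs carry at most 255 data bytes");
        }
        if (ne < 0 || ne > 256) {
            throw new IllegalArgumentException("Ne must be between 0 and 256");
        }
        ByteArrayOutputStream apdu = new ByteArrayOutputStream();
        apdu.write(cla);
        apdu.write(ins);
        apdu.write(p1);
        apdu.write(p2);
        if (nc > 0) {
            apdu.write(nc);            // Lc
            apdu.write(data, 0, nc);   // command data field
        }
        if (ne > 0) {
            apdu.write(ne == 256 ? 0x00 : ne); // Le, with 0x00 meaning 256
        }
        return apdu.toByteArray();
    }
}
```

For instance, build(0x00, 0xCA, 0x00, 0x5A, null, 0) yields the 4-byte case 1 command from the question, while passing ne = 256 appends Le = 0x00.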
Is the command 00 CA 00 5A correct?
Probably not, for several reasons:
Since you expect the card to deliver a response data field (i.e. the data object 0x5A), you need to specify an Le field. Hence, a valid format would be
+------+------+------+------+------+
| CLA | INS | P1 | P2 | Le |
+------+------+------+------+------+
| 0x00 | 0xCA | 0x00 | 0x5A | 0x00 |
+------+------+------+------+------+
You receive the status word 6E 00 in response to the command. The meaning of this status word is "class not supported". This indicates that commands with the CLA byte set to 0x00 are not supported in the current state. With some cards this also simply means that this combination of CLA and INS (00 CA) is not supported, even though this contradicts the definition in ISO/IEC 7816-4.
Overall, you can assume that your card does not support this command in its current execution state.
Assuming you are interacting with an EMV payment card, you typically need to select an application first. Your question does not indicate whether you do this already, so I assume you don't do it right now. Selecting an application is done by sending a SELECT (by AID) command:
+------+------+------+------+------+-----------------+------+
| CLA | INS | P1 | P2 | Lc | DATA | Le |
+------+------+------+------+------+-----------------+------+
| 0x00 | 0xA4 | 0x04 | 0x00 | 0xXX | Application AID | 0x00 |
+------+------+------+------+------+-----------------+------+
The value of the application AID, of course, depends on the card application and may be obtained by following the discovery procedures defined in the EMV specifications.
Even after application selection, the GET DATA APDU command for EMV applications is defined in the proprietary class. Consequently, the CLA byte must be set to 0x80:
+------+------+------+------+------+
| CLA | INS | P1 | P2 | Le |
+------+------+------+------+------+
| 0x80 | 0xCA | 0x00 | 0x5A | 0x00 |
+------+------+------+------+------+
Finally, even then, I'm not aware of any schemes where cards would allow you to retrieve the PAN through a GET DATA command. Usually, the PAN is only accessible through file/record based access. Since you did not reveal the specific type/brand of your card, it's impossible to tell what your card may or may not actually support.
To start with:
The ISO 7816 standard consists of several parts.
When terminal vendors refer to ISO 7816, they usually only confirm that the common physical characteristics (Part 1), dimensions and contacts (Part 2) and transmission protocol (Part 3) apply to the reader device.
The APDU commands and responses defined in ISO 7816 Part 4 (and a few other parts) are generic definitions and might not be fully supported by your smartcard.
You need to learn about the card-terminal interaction layers related to your card type:
EMV is the customized version of ISO 7816 for payment cards.
The global card brands use their own customized specifications based on EMV and ISO 7816, for example Amex "AEIPS", Diners "D-PAS", MasterCard "M/Chip", Visa "VIS", etc. They are almost the same, with small differences in the supported commands, flows and lists of tags.
Unfortunately, most payment cards are not supposed to return the value of tag 0x5A in response to a GET DATA APDU command. Usually you need to follow the payment procedure: at least SELECT the card application and READ the tag values from the SFI card records.
According to EMV, the GET DATA P1 P2 values should be used for tags 0x9F36, 0x9F13, 0x9F17, or 0x9F4F.
Answering your questions:
What to send in the fifth parameter? What is the length of the response?
The fifth byte is known as "Le", the length of expected data. You can try to use Le = "00".
If the APDU command is supported by the card, you may get SW1SW2 as 0x"6Cxx", where xx is the hexadecimal length of the requested data. Then you can repeat the same command with the correct Le value.
For example, to read the PIN Try Counter:
Get Data (Tag = '9F 17')
Request : 80 CA 9F 17 00
Response: 6C 04
SW1 SW2: 6C 04 (SW_Warning Wrong length(Le))
Get Data (Tag = '9F 17')
Request : 80 CA 9F 17 04
Response: 9F 17 01 03 90 00
Data : 9F 17 01 03 // Tag + Length + Value
Tag 9F 17: Personal Identification Number (PIN) Try Counter : 03
SW1 SW2 : 90 00 (SW_OK)
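The two-step exchange above (send with Le = 00, receive 6C xx, resend with Le = xx) could be wrapped as follows. This is only a sketch: the card is modelled as a hypothetical function from command bytes to response bytes rather than the terminal's real transmit call, and the command is assumed to be a 5-byte case 2 APDU whose last byte is Le.

```java
import java.util.Arrays;
import java.util.function.UnaryOperator;

final class LeRetry {
    // card: hypothetical transport that returns the full response,
    //       i.e. optional data followed by SW1 SW2.
    static byte[] transmitWithLeRetry(UnaryOperator<byte[]> card, byte[] command) {
        byte[] response = card.apply(command);
        boolean wrongLength = response.length == 2 && (response[0] & 0xFF) == 0x6C;
        if (wrongLength) {
            // SW2 carries the length the card expects; reuse it as Le.
            byte[] retry = Arrays.copyOf(command, command.length);
            retry[retry.length - 1] = response[1];
            response = card.apply(retry);
        }
        return response;
    }

    // True when the response ends with the status word 90 00.
    static boolean isSuccess(byte[] response) {
        int n = response.length;
        return n >= 2 && (response[n - 2] & 0xFF) == 0x90 && response[n - 1] == 0x00;
    }
}
```

Only a single retry is attempted, so a card that keeps answering 6C xx cannot cause an endless loop.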
If the command were successful, would I see the information as plain text when casting the answer to a string, instead of 6E 00?
APDU commands and responses use byte encoding. According to the terminal API example you provided, you will get an array of bytes.
As a developer, you can transform the bytes into the desired format or use them as-is. Please keep in mind that, according to the EMV specifications, the format of tag data varies:
HEX (or binary), for example for numeric tags like amounts;
BCD, for example for date/time or some numbers like currency. The PAN is also BCD-encoded;
Strings in different charsets (ASCII, Unicode, ...), for example for Cardholder Name, Application Name.
etc.
Tag 0x5A, the Application Primary Account Number (PAN), is encoded as BCD and can be padded with 0xF if the PAN length is odd.
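Since the PAN is BCD-encoded rather than text, casting the bytes to a string will not give readable digits. A small sketch of the decoding, assuming the input is just the value part of tag 0x5A (tag and length already stripped) and using a made-up class name:

```java
final class PanDecoder {
    // Converts BCD bytes to a digit string, stopping at the first 0xF padding nibble.
    static String decodePan(byte[] value) {
        StringBuilder digits = new StringBuilder(value.length * 2);
        for (byte b : value) {
            int high = (b >> 4) & 0x0F;
            int low = b & 0x0F;
            for (int nibble : new int[] {high, low}) {
                if (nibble == 0x0F) {
                    return digits.toString(); // padding reached
                }
                if (nibble > 9) {
                    throw new IllegalArgumentException("not a BCD digit: " + nibble);
                }
                digits.append((char) ('0' + nibble));
            }
        }
        return digits.toString();
    }
}
```

Each byte holds two decimal digits, so an odd-length PAN ends with one 0xF nibble that is dropped.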
I am just answering how to READ your specific tag data, since APDU and application state behavior has already been covered.
After you SELECT the application, you can issue a GET PROCESSING OPTIONS command. This is the actual start of the transaction. The response contains a tag named AFL (Application File Locator). You need to parse this element and send multiple READ RECORD commands until you find the data.
The AFL is a sequence of four-byte entries (if it references two SFIs, there will be eight bytes of data).
The first byte denotes the SFI (its 5 most significant bits are the input to P2 of READ RECORD). The second byte denotes the first record to read (the input to P1 of READ RECORD). The third byte denotes the last record to read (you need to loop READ RECORD from the first record up to this one). The fourth byte denotes the number of records involved in offline data authentication.
As you parse through, you will find your required data. In case you are not sure how to parse it, copy the hex data and try it here
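The AFL walk described above can be sketched as a function that turns the AFL bytes into the list of READ RECORD commands to send. It assumes the AFL value has already been extracted from the GET PROCESSING OPTIONS response; the class name is hypothetical, and sending the commands and parsing the returned records is left out.

```java
import java.util.ArrayList;
import java.util.List;

final class AflReader {
    // Builds one READ RECORD command (00 B2 P1 P2 00) per record listed in the AFL.
    static List<byte[]> readRecordCommands(byte[] afl) {
        if (afl.length % 4 != 0) {
            throw new IllegalArgumentException("AFL length must be a multiple of 4");
        }
        List<byte[]> commands = new ArrayList<>();
        for (int i = 0; i < afl.length; i += 4) {
            // Upper five bits carry the SFI; low bits 100 mean "P1 is a record number".
            int p2 = (afl[i] & 0xF8) | 0x04;
            int firstRecord = afl[i + 1] & 0xFF;
            int lastRecord = afl[i + 2] & 0xFF;
            // afl[i + 3] (records used in offline data authentication) is not needed here.
            for (int record = firstRecord; record <= lastRecord; record++) {
                commands.add(new byte[] {0x00, (byte) 0xB2, (byte) record, (byte) p2, 0x00});
            }
        }
        return commands;
    }
}
```

Le is set to 0x00 in each command so that the card returns whatever length the record has.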

I have a csv file with two different formats of date value, want them to be in any single format

My CSV file looks like this
id date
1602 11/23/2015 14:10
1602 11/23/2015 22:45
1602 18/10/2011 09:19:46 AM
1702 18/10/2011 09:07:33 AM
1863 18/10/2011 09:07:35 AM
1436 18/10/2011 09:07:36 AM
I'm looking for output like
id date
1602 11/23/2015 14:10
1602 11/23/2015 22:45
1602 10/18/2011 09:19:46 AM
1702 10/18/2011 09:07:33 AM
1863 10/18/2011 09:07:35 AM
1436 10/18/2011 09:07:36 AM
I'm not sure why you are making this so much more difficult than it is. It seems that all you are actually doing is (1) getting rid of the first line of the CSV file; (2) tossing out any quotation marks; (3) mapping sets of spaces/tabs into single spaces; and (4) mapping newlines to spaces.
SO . . . how about the following? (I assume the data comes from standard input.)
sed 1d | tr -d '"' | tr -s ' \011\012' ' '
The 'sed' deletes the first line; the first 'tr' strips quotation marks; the second 'tr' maps runs of spaces, tabs, and/or newlines to single spaces (\011 and \012 are the octal codes for ASCII TAB and ASCII NL, respectively).
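For the date conversion the question actually asks about, one possible approach is to try parsing each date value in the day-first format and rewrite it month-first, leaving everything else unchanged. The sketch below rests on an assumption taken from the sample data only: the day-first values are exactly the ones with seconds and an AM/PM marker. A value such as 05/06/2011 09:07:33 AM would be ambiguous and is treated as day-first here. The class name is made up.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

final class DateNormalizer {
    private static final DateTimeFormatter DAY_FIRST =
            DateTimeFormatter.ofPattern("dd/MM/uuuu hh:mm:ss a", Locale.ENGLISH);
    private static final DateTimeFormatter MONTH_FIRST =
            DateTimeFormatter.ofPattern("MM/dd/uuuu hh:mm:ss a", Locale.ENGLISH);

    // Rewrites day-first dates as month-first; returns other values as they are.
    static String normalize(String date) {
        try {
            return LocalDateTime.parse(date, DAY_FIRST).format(MONTH_FIRST);
        } catch (DateTimeParseException e) {
            return date; // not in the day-first format, keep unchanged
        }
    }
}
```

Applied to the date column of each CSV line, this leaves 11/23/2015 14:10 alone and swaps day and month in the 18/10/2011 values.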
