Origin-destination matrix with OpenTripPlanner scripting in Jython / Python

I'm trying to write a Python script to build an origin-destination matrix using OpenTripPlanner (OTP), but I'm very new to Python and OTP.
The idea is to use OTP scripting in Jython/Python to build an origin-destination matrix with travel times between pairs of locations. In short, I launch a Jython jar file to call the test.py Python script, but I'm struggling to get the Python script to do what I want.
A light and simple reproducible example is provided here, and below is the Python script I've tried.
Python Script
#!/usr/bin/jython
from org.opentripplanner.scripting.api import *
# Instantiate an OtpsEntryPoint
otp = OtpsEntryPoint.fromArgs(['--graphs', 'C:/Users/rafa/Desktop/jython_portland',
'--router', 'portland'])
# Get the default router
router = otp.getRouter('portland')
# Create a default request for a given time
req = otp.createRequest()
req.setDateTime(2015, 9, 15, 10, 00, 00)
req.setMaxTimeSec(1800)
req.setModes('WALK,BUS,RAIL')
# Read Points of Origin
points = otp.loadCSVPopulation('points.csv', 'Y', 'X')
# Read Points of Destination
dests = otp.loadCSVPopulation('points.csv', 'Y', 'X')
# Create a CSV output
matrixCsv = otp.createCSVOutput()
matrixCsv.setHeader([ 'Origin', 'Destination', 'min_time' ])
# Start Loop
for origin in points:
print "Processing: ", origin
req.setOrigin(origin)
spt = router.plan(req)
if spt is None: continue
# Evaluate the SPT for all points
result = spt.eval(dests)
# Find the time to other points
if len(result) == 0: minTime = -1
else: minTime = min([ r.getTime() for r in result ])
# Add a new row of result in the CSV output
matrixCsv.addRow([ origin.getStringData('GEOID'), r.getIndividual().getStringData('GEOID'), minTime ])
# Save the result
matrixCsv.save('traveltime_matrix.csv')
The output should look something like this:
GEOID GEOID travel_time
1 1 0
1 2 7
1 3 6
2 1 10
2 2 0
2 3 12
3 1 5
3 2 10
3 3 0
P.S. I tried to create a new tag, opentripplanner, for this question, but I don't have enough reputation to do that.

Laurent Grégoire has kindly answered the question on GitHub, so I reproduce his solution here.
This code works but still it would take a long time to compute large OD matrices (say more than 1 million pairs). Hence, any alternative answers that improve the speed/efficiency of the code would be welcomed!
#!/usr/bin/jython
from org.opentripplanner.scripting.api import OtpsEntryPoint
# Instantiate an OtpsEntryPoint
otp = OtpsEntryPoint.fromArgs(['--graphs', '.',
'--router', 'portland'])
# Start timing the code
import time
start_time = time.time()
# Get the default router
# Could also be called: router = otp.getRouter('paris')
router = otp.getRouter('portland')
# Create a default request for a given time
req = otp.createRequest()
req.setDateTime(2015, 9, 15, 10, 00, 00)
req.setMaxTimeSec(7200)
req.setModes('WALK,BUS,RAIL')
# The file points.csv contains the columns GEOID, X and Y.
points = otp.loadCSVPopulation('points.csv', 'Y', 'X')
dests = otp.loadCSVPopulation('points.csv', 'Y', 'X')
# Create a CSV output
matrixCsv = otp.createCSVOutput()
matrixCsv.setHeader([ 'Origin', 'Destination', 'Walk_distance', 'Travel_time' ])
# Start Loop
for origin in points:
    print "Processing origin: ", origin
    req.setOrigin(origin)
    spt = router.plan(req)
    if spt is None: continue

    # Evaluate the SPT for all points
    result = spt.eval(dests)

    # Add a new row of result in the CSV output
    for r in result:
        matrixCsv.addRow([ origin.getStringData('GEOID'), r.getIndividual().getStringData('GEOID'), r.getWalkDistance(), r.getTime() ])

# Save the result
matrixCsv.save('traveltime_matrix.csv')
# Stop timing the code
print("Elapsed time was %g seconds" % (time.time() - start_time))
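Not part of Laurent's answer, but since the question asks about speed: one cheap improvement for very large matrices is to partition points.csv and run one OTP/Jython process per chunk in parallel, then concatenate the partial CSVs, since the per-origin searches are independent. A sketch of the splitting step (plain Python 3, run outside Jython; chunk count and file names are assumptions):

```python
# Sketch: split points.csv into N chunk files so several OTP/Jython
# processes can each handle a subset of the origins in parallel.
import csv

def split_csv(path, n_chunks, prefix='points_chunk'):
    with open(path, newline='') as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = list(reader)
    chunk_size = (len(rows) + n_chunks - 1) // n_chunks  # ceiling division
    out_files = []
    for i in range(n_chunks):
        chunk = rows[i * chunk_size:(i + 1) * chunk_size]
        if not chunk:
            break
        out_path = '%s_%d.csv' % (prefix, i)
        with open(out_path, 'w', newline='') as out:
            writer = csv.writer(out)
            writer.writerow(header)   # keep the GEOID/X/Y header in every chunk
            writer.writerows(chunk)
        out_files.append(out_path)
    return out_files
```

Each chunk file can then be fed to the script above in its own JVM instance.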

Related

Java Parsing a String to Extract Data

I have to write a program but I have no idea where to start. Can anyone help me with an outline of how I should go about it? Please excuse my novice level of programming. I have provided the input and output of the program below.
The trouble I'm facing is how to handle the input text. How should I store the input text so I can extract the data needed to produce the output commands? Any guidance would be very helpful.
A little explanation of the input:
The output will start with APPLE1:CT= (whatever number is given for CT in line 4).
The following lines of the output will begin with "APPLES:".
I must extract and include the values for CR, PLANTING and RW in the output.
Wherever there is a non-zero or non-null value in the DATA portion, it will appear in the output.
When the program reads END, "APP;" and "APPLER:CT=(whatever number);" will be the last two commands.
INPUT:
<apple:ct=12;
FARM DATA
INPUT DATA
CT CH CR PLANTING RW DATA
12 YES PG -0 FA=1 R=CODE1 MM2 COA COB CI COC COD
0 0 1 0
COE RN COF COG COH
4 00 0
COI COJ D
0
FA=2 R=CODE2 112 COA COB CI COC COD
0 0 0 0
COE RN COF COG COH
4 00 0
COI COJ D
7
END
OUTPUT:
APPLE1:CT=12;
APPLES:CR=PG-0,FA=1,R=CODE1,RW=MM2,COC=1,COE=4;
APPLES:FA=2,R=CODE2,RW=112,COE=4,COI=7;
APP;
APPLER:CT=12;

How to compute pseudorange from the parameters fetched via Google GNSSLogger?

The official GNSS raw measurements fetched via the GNSS Logger app provide the following parameters:
TimeNanos
LeapSecond
TimeUncertaintyNanos
FullBiasNanos
BiasNanos
BiasUncertaintyNanos
DriftNanosPerSecond
DriftUncertaintyNanosPerSecond
HardwareClockDiscontinuityCount
Svid
TimeOffsetNanos
State
ReceivedSvTimeNanos
ReceivedSvTimeUncertaintyNanos
Cn0DbHz
PseudorangeRateMetersPerSecond
PseudorangeRateUncertaintyMetersPerSecond
I'm looking for the raw pseudorange measurements PR from the above data. A little help?
Reference 1: https://github.com/google/gps-measurement-tools
Reference 2: https://developer.android.com/guide/topics/sensors/gnss
Pseudorange[m] = (AverageTravelTime[s] + delta_t[s]) * speedOfLight[m/s]
where: m - meters, s - seconds.
Try this way:
Select satellites from one constellation (at first, try GPS).
Choose the max value of ReceivedSvTimeNanos.
Calculate delta_t for each satellite as the max ReceivedSvTimeNanos minus the current ReceivedSvTimeNanos (delta_t = maxRst - curRst).
Assume an average travel time of 70 milliseconds and a speed of light of 299792458 m/s, and use them in the calculation.
Don't forget to convert all values to the same units.
For details refer to this pdf and UserPositionVelocityWeightedLeastSquare class
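The steps above can be sketched roughly as follows (plain Python; the 70 ms average travel time is the heuristic stated in the answer, and the function and variable names are assumptions):

```python
# Rough pseudorange approximation from ReceivedSvTimeNanos, per the steps above.
SPEED_OF_LIGHT = 299792458.0   # m/s
AVG_TRAVEL_TIME_S = 0.070      # assumed average signal travel time, ~70 ms

def approx_pseudoranges(received_sv_time_nanos):
    """received_sv_time_nanos: ReceivedSvTimeNanos values for one constellation."""
    max_rst = max(received_sv_time_nanos)
    pseudoranges = []
    for cur_rst in received_sv_time_nanos:
        delta_t_s = (max_rst - cur_rst) * 1e-9   # nanoseconds -> seconds
        # Pseudorange[m] = (AverageTravelTime[s] + delta_t[s]) * speedOfLight[m/s]
        pseudoranges.append((AVG_TRAVEL_TIME_S + delta_t_s) * SPEED_OF_LIGHT)
    return pseudoranges
```

This is only a coarse approximation; the least-squares machinery in the referenced class refines it.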
Unfortunately Android doesn't provide pseudorange directly from the API - you have to calculate this yourself.
The EU GSA has a great document here that explains in detail how to use GNSS raw measurements in section 2.4:
https://www.gsa.europa.eu/system/files/reports/gnss_raw_measurement_web_0.pdf
Specifically, section 2.4.2 explains how to calculate pseudorange from the data given by the Android APIs. It's literally pages of text, so I won't copy the whole thing inline here, but here's Example 1 they share: a Matlab code snippet to compute the pseudorange for Galileo, GPS and BeiDou signals when the time-of-week is decoded:
% Select GPS + GAL TOW decoded (state bit 3 enabled)
pos = find( (gnss.Const == 1 | gnss.Const == 6) & bitand(gnss.State,2^3) );
% Generate the measured time in full GNSS time
tRx_GNSS = gnss.timeNano(pos) - (gnss.FullBiasNano(1) + gnss.BiasNano(1));
% Change the valid range from full GNSS to TOW
tRx = mod(tRx_GNSS,WEEKSEC*1e9);
% Generate the satellite time
tTx = gnss.ReceivedSvTime(pos) + gnss.TimeOffsetNano(pos);
% Generate the pseudorange
prMilliSeconds = (tRx - tTx );
pr = prMilliSeconds *Constant.C*1e-9;
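For readers not on Matlab, a rough Python transliteration of that snippet might look like this (applied per measurement; the field names mirror the GnssLogger columns, and WEEKSEC = 604800 seconds per GPS week is taken from the GSA document):

```python
# Pseudorange from GnssLogger fields when the time-of-week is decoded,
# mirroring the GSA Matlab example (function and variable names are mine).
SPEED_OF_LIGHT = 299792458.0  # m/s (Constant.C in the Matlab snippet)
WEEKSEC = 604800              # seconds per GPS week

def pseudorange_tow(time_nanos, full_bias_nanos, bias_nanos,
                    received_sv_time_nanos, time_offset_nanos):
    # Measured (receiver) time in full GNSS time...
    t_rx_gnss = time_nanos - (full_bias_nanos + bias_nanos)
    # ...reduced to time-of-week
    t_rx = t_rx_gnss % (WEEKSEC * 1e9)
    # Satellite (transmit) time
    t_tx = received_sv_time_nanos + time_offset_nanos
    # Time difference in nanoseconds -> metres
    return (t_rx - t_tx) * SPEED_OF_LIGHT * 1e-9
```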

Converting Python Function to Hive UDAF

How do I convert the following Python function, longToDigitArray to a HiveQL UDAF? I am not familiar with Java.
# Convert a source long (e.g. 2305843012434919424) into a list of source flags.
# Desired behavior:
#   Input:  longToDigitArray(2305843012434919424)
#   Output: [31, 32, 62]
def longToDigitArray(x):
    a = []
    i = 1
    try:
        x = long(x)
    except:
        return a
    while x != 0:
        if x & 1:       # bitwise AND: true when the lowest bit of x is set
            a.append(i)
        x = x >> 1      # bitwise right-shift by 1
        i = i + 1
    return a
Any insight is appreciated.
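Not an answer to the Hive/Java part, but as a sanity check before porting, here is the same function in Python 3 (where the old long type is just int), together with the example from the comments:

```python
def long_to_digit_array(x):
    """Return the 1-based positions of the set bits in x, lowest bit first."""
    a = []
    i = 1
    try:
        x = int(x)          # Python 3: int covers the old long
    except (TypeError, ValueError):
        return a
    while x != 0:
        if x & 1:           # lowest bit set?
            a.append(i)
        x >>= 1             # drop the lowest bit
        i += 1
    return a

print(long_to_digit_array(2305843012434919424))  # → [31, 32, 62]
```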

Predict function R returns 0.0 [duplicate]

I posted earlier today about an error I was getting with the predict function. I was able to get that corrected, and thought I was on the right path.
I have a number of observations (actuals) and a few data points that I want to extrapolate or predict. I used lm to create a model, then tried to use predict with the actual values that will serve as the predictor input.
This code is all repeated from my previous post, but here it is:
df <- read.table(text = '
Quarter Coupon Total
1 "Dec 06" 25027.072 132450574
2 "Dec 07" 76386.820 194154767
3 "Dec 08" 79622.147 221571135
4 "Dec 09" 74114.416 205880072
5 "Dec 10" 70993.058 188666980
6 "Jun 06" 12048.162 139137919
7 "Jun 07" 46889.369 165276325
8 "Jun 08" 84732.537 207074374
9 "Jun 09" 83240.084 221945162
10 "Jun 10" 81970.143 236954249
11 "Mar 06" 3451.248 116811392
12 "Mar 07" 34201.197 155190418
13 "Mar 08" 73232.900 212492488
14 "Mar 09" 70644.948 203663201
15 "Mar 10" 72314.945 203427892
16 "Mar 11" 88708.663 214061240
17 "Sep 06" 15027.252 121285335
18 "Sep 07" 60228.793 195428991
19 "Sep 08" 85507.062 257651399
20 "Sep 09" 77763.365 215048147
21 "Sep 10" 62259.691 168862119', header=TRUE)
str(df)
'data.frame': 21 obs. of 3 variables:
$ Quarter : Factor w/ 24 levels "Dec 06","Dec 07",..: 1 2 3 4 5 7 8 9 10 11 ...
$ Coupon: num 25027 76387 79622 74114 70993 ...
$ Total: num 132450574 194154767 221571135 205880072 188666980 ...
Code:
model <- lm(df$Total ~ df$Coupon, data=df)
> model
Call:
lm(formula = df$Total ~ df$Coupon)
Coefficients:
(Intercept) df$Coupon
107286259 1349
Predict code (based on previous help):
(These are the predictor values I want to use to get the predicted value)
Quarter = c("Jun 11", "Sep 11", "Dec 11")
Total = c(79037022, 83100656, 104299800)
Coupon = data.frame(Quarter, Total)
Coupon$estimate <- predict(model, newdate = Coupon$Total)
Now, when I run that, I get this error message:
Error in `$<-.data.frame`(`*tmp*`, "estimate", value = c(60980.3823396919, :
replacement has 21 rows, data has 3
My original data frame that I used to build the model had 21 observations in it. I am now trying to predict 3 values based on the model.
I either don't truly understand this function, or have an error in my code.
Help would be appreciated.
Thanks
First, you want to use
model <- lm(Total ~ Coupon, data=df)
not model <-lm(df$Total ~ df$Coupon, data=df).
Second, by saying lm(Total ~ Coupon), you are fitting a model that uses Total as the response variable, with Coupon as the predictor. That is, your model is of the form Total = a + b*Coupon, with a and b the coefficients to be estimated. Note that the response goes on the left side of the ~, and the predictor(s) on the right.
Because of this, when you ask R to give you predicted values for the model, you have to provide a set of new predictor values, i.e. new values of Coupon, not Total.
Third, judging by your specification of newdata, it looks like you're actually after a model to fit Coupon as a function of Total, not the other way around. To do this:
model <- lm(Coupon ~ Total, data=df)
new.df <- data.frame(Total=c(79037022, 83100656, 104299800))
predict(model, new.df)
Thanks Hong, that was exactly the problem I was running into. The error you get suggests that the number of rows is wrong, but the problem is actually that the model was trained using a command that ends up with the wrong names for the parameters.
This is a critical detail that is entirely non-obvious for lm and friends. Some tutorials use lines like lm(olive$Area ~ olive$Palmitic), which ends up with a variable name of olive$Area, NOT Area, so a new data frame created with anewdata <- data.frame(Palmitic=2) can't then be used. If you use lm(Area ~ Palmitic, data=olive) then the variable names are right and prediction works.
The real problem is that the error message does not indicate the problem at all:
Warning message: 'anewdata' had 1 rows but variable(s) found to have X rows
Note that in your predict code you wrote newdate instead of newdata; check that. Then just use Coupon$estimate <- predict(model, Coupon).
It will work.
To avoid the error, an important point about the new dataset is the name of the independent variable: it must be the same as the one used in the model. Another way is to nest the two functions without creating a new dataset:
model <- lm(Coupon ~ Total, data=df)
predict(model, data.frame(Total=c(79037022, 83100656, 104299800)))
Pay attention to how the model is specified. The next two commands look similar, but for the predict function the first works and the second doesn't:
model <- lm(Coupon ~ Total, data=df)  # OK
model <- lm(df$Coupon ~ df$Total)     # not OK

Search hierarchical text in Oracle database

Table = BLOCK (Has composite unique index both the columns)
IP_ADDRESS CIDR_SIZE
========= ==========
10.10 16
15.0 16
67.7 16
18.0 8
Requirements:
Sub-blocks are not allowed. For example, 67.7.1 with CIDR size 24 is not allowed because it is a child of 67.7. In other words, if any IP address in the database matches the beginning portion of the new IP, the insert should fail. Is it possible to do this with an Oracle SQL query?
I was thinking of doing it by...
Select all records into the memory.
Convert each IP into its binary bits
10.10 = 00001010.00001010
15.0 = 00001111.00000000
67.7 = 01000011.00000111
18.0 = 00010010.00000000
Convert new IP into binary bit. 67.7.1 = 01000011.00000111.00000001
Check to see if new IP binary bits start with existing IP binary bits.
If true, then the new record exists in the database.
For example, the new binary bits 01000011.00000111.00000001 do start with the existing IP (67.7) binary bits 01000011.00000111; the rest of the records don't match.
I am looking to see if there is an Oracle query that can do this for me, that is, return the matching IP addresses from the database. I checked Oracle's Text API but didn't find anything yet.
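The binary-prefix idea described above is easy to prototype outside the database; a minimal Python sketch (not the requested Oracle query, and the function names are mine):

```python
def ip_bits(ip):
    """'67.7' -> '0100001100000111' (8 bits per octet, concatenated)."""
    return ''.join(format(int(octet), '08b') for octet in ip.split('.'))

def is_sub_block(new_ip, existing_ips):
    """True if new_ip's bit string starts with any existing block's bits."""
    new_bits = ip_bits(new_ip)
    return any(new_bits.startswith(ip_bits(ip)) for ip in existing_ips)
```

For instance, is_sub_block('67.7.1', ['10.10', '15.0', '67.7', '18.0']) is True, because 67.7.1's bits begin with 67.7's bits.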
Is there a reason you can't use the INSTR function?
http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/functions068.htm#i77598
I'd do something like a NOT EXISTS clause that checks for INSTR(b_outer.IP_ADDRESS,b_inner.IP_ADDRESS) <> 1
*edit: thinking about this you'd probably need to check to see if the result is 1 (meaning the potential IP address matches starting at the first character of an existing IP address) as opposed to a general substring search as I originally had it.
Yes, you can do it in SQL by converting the IPs to numbers and then ensuring there is no record with a smaller CIDR size that gives the same ipnum when using its CIDR size.
WITH ipv AS
( SELECT IP.*
, NVL(REGEXP_SUBSTR( ip, '\d+', 1, 1 ),0) * 256 * 256 * 256 -- octet1
+ NVL(REGEXP_SUBSTR( ip, '\d+', 1, 2 ),0) * 256 * 256 -- octet2
+ NVL(REGEXP_SUBSTR( ip, '\d+', 1, 3 ),0) * 256 -- octet3
+ NVL(REGEXP_SUBSTR( ip, '\d+', 1, 4 ),0) AS ipnum -- octet4
, 32-bits AS ignorebits
FROM ips IP
)
SELECT IP1.ip, IP1.bits
FROM ipv IP1
WHERE NOT EXISTS
( SELECT 1
FROM ipv IP2
WHERE IP2.bits < IP1.bits
AND TRUNC( IP2.ipnum / POWER( 2, IP2.ignorebits ) )
= TRUNC( IP1.ipnum / POWER( 2, IP2.ignorebits ) )
)
Note: My example uses the table equivalent to yours:
SQL> desc ips
Name Null? Type
----------------------------------------- -------- ----------------------------
IP NOT NULL VARCHAR2(16)
BITS NOT NULL NUMBER
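The arithmetic in that query can be sanity-checked with a short script; this sketch mirrors the ipnum octet packing and the TRUNC(ipnum / POWER(2, 32 - bits)) comparison (function names are mine):

```python
def ip_num(ip):
    """Pack up to four dotted octets into a 32-bit number; missing octets = 0."""
    octets = [int(o) for o in ip.split('.')] + [0, 0, 0, 0]
    return sum(octets[k] * 256 ** (3 - k) for k in range(4))

def network_prefix(ip, bits):
    """Equivalent of TRUNC(ipnum / POWER(2, ignorebits)), ignorebits = 32 - bits."""
    return ip_num(ip) >> (32 - bits)

# A /24 block 67.7.1 is shadowed by the /16 block 67.7 exactly when their
# prefixes agree at the shorter (16-bit) length:
assert network_prefix('67.7.1', 16) == network_prefix('67.7', 16)
```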
