I have a requirement to read user information from two different sources (databases) per userId and to store the consolidated information in a Map keyed by userId. The number of users can vary with the period they have opted for; groups of users may belong to different periods of the year, e.g. daily, weekly, or monthly users.
I used HashMap and LinkedHashMap to get this done. Since that slows the process down, I thought of using threading to make it faster.
After reading some tutorials and examples I am now using ConcurrentHashMap and ExecutorService.
In some cases, based on validation, I want to skip the current iteration and move on to the next user. Inside the submitted task I cannot use the continue keyword the way I would in a plain for loop. Is there a way to achieve the same thing differently in multithreaded code?
Moreover, the code below works, but it is not significantly faster than the code without threading, which makes me doubt whether the ExecutorService is implemented correctly.
Also, how do I debug errors in multithreaded code? Execution stops at a breakpoint, but not consistently, and it does not move to the next line with F6.
Can someone point out whether I am missing something in the code? Any other example of a similar use case would also be a great help.
public void getMap() throws UserException
{
long startTime = System.currentTimeMillis();
Map<String, Map<Integer, User>> map = new ConcurrentHashMap<String, Map<Integer, User>>();
//final String key = "";
try
{
final Date todayDate = new Date();
List<String> applyPeriod = db.getPeriods(todayDate);
for (String period : applyPeriod)
{
try
{
final String key = period;
List<UserTable1> eligibleUsers = db.findAllUsers(key);
Map<Integer, User> userIdMap = new ConcurrentHashMap<Integer, User>();
ExecutorService executor = Executors.newFixedThreadPool(eligibleUsers.size());
CompletionService<User> cs = new ExecutorCompletionService<User>(executor);
int userCount=0;
for (UserTable1 eligibleUser : eligibleUsers)
{
try
{
cs.submit(
new Callable<User>()
{
public User call()
{
int userId = eligibleUser.getUserId();
List<EmployeeTable2> empData = db.findByUserId(userId);
EmployeeTable2 emp = null;
if (null != empData && !empData.isEmpty())
{
emp = empData.get(0);
}else{
String errorMsg = "No record found for given User ID in emp table";
logger.error(errorMsg);
//continue;
// continue does not work here.
}
User user = new User();
user.setUserId(userId);
user.setFullName(emp.getFullName());
return user;
}
}
);
userCount++;
}
catch(Exception ex)
{
String errorMsg = "Error while creating map :" + ex.getMessage();
logger.error(errorMsg);
}
}
for (int i = 0; i < userCount ; i++ ) {
try {
User user = cs.take().get();
if (user != null) {
userIdMap.put(user.getUserId(), user);
}
} catch (ExecutionException e) {
} catch (InterruptedException e) {
}
}
executor.shutdown();
map.put(key, userIdMap);
}
catch(Exception ex)
{
String errorMsg = "Error while creating map :" + ex.getMessage();
logger.error(errorMsg);
}
}
}
catch(Exception ex){
String errorMsg = "Error while creating map :" + ex.getMessage();
logger.error(errorMsg);
}
logger.info("Size of Map : " + map.size());
Set<String> periods = map.keySet();
logger.info("Size of periods : " + periods.size());
for(String period :periods)
{
Map<Integer, User> mapOfuserIds = map.get(period);
Set<Integer> userIds = mapOfuserIds.keySet();
logger.info("Size of Set : " + userIds.size());
for(Integer userId : userIds){
User inf = mapOfuserIds.get(userId);
logger.info("User Id : " + inf.getUserId());
}
}
long endTime = System.currentTimeMillis();
long timeTaken = (endTime - startTime);
logger.info("All threads are completed in " + timeTaken + " milisecond");
logger.info("******END******");
}
You really don't want to create a thread pool with as many threads as users you've read from the db. That rarely makes sense, because you need to keep in mind that threads have to run somewhere... There are not many servers out there with 10 or 100 or even 1000 cores reserved for your application. A much smaller value, maybe 5, is often enough, depending on your environment.
And as always with performance topics: you first need to find out what your actual bottleneck is. Your application may simply not benefit from threading because, e.g., you are reading from a db which only allows 5 concurrent connections at the same time. In that case all your other 995 threads will simply wait.
Another thing to consider is network latency: reading individual user ids from multiple threads may even increase the round-trip time needed to get the data for one user from the database. An alternative approach might be to read not one user at a time, but the data of all 10,000 of them at once. That way your (possibly 10 GBit) Ethernet connection to your database might really speed things up, because you have only a small communication overhead with the database and it might serve you all the data you need in one answer.
So in short: in my opinion your question is about performance optimization of your problem in general, but you don't yet know enough about the bottleneck to decide which way to go.
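For illustration, a minimal sketch of a bounded pool sized from the machine rather than from the user count; the cap of 16 and the factor of 2 for I/O-bound work are assumptions to be measured, not recommendations:
// A small, bounded pool; one thread per user mostly adds scheduling overhead.
int poolSize = Math.min(16, Runtime.getRuntime().availableProcessors() * 2);
ExecutorService executor = Executors.newFixedThreadPool(poolSize);
CompletionService<User> cs = new ExecutorCompletionService<>(executor);
// Create the pool once, outside the period loop, submit the per-user Callables
// as before, and call executor.shutdown() once at the very end.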
You could try something like this:
List<String> periods = db.getPeriods(todayDate);
Map<String, Map<Integer, User>> hm = new ConcurrentHashMap<>();
periods.parallelStream().forEach(s -> {
    List<UserTable1> eligibleUsers = db.findAllUsers(s);
    hm.put(s, eligibleUsers.parallelStream().collect(
        Collectors.toMap(UserTable1::getUserId, u -> createUserForId(u.getUserId()))));
});
And in createUserForId you do your db reading:
private User createUserForId(Integer id) {
    List<EmployeeTable2> empData = db.findByUserId(id);
    EmployeeTable2 emp = empData.get(0); // handle an empty result as needed
    User user = new User();
    user.setUserId(id);
    user.setFullName(emp.getFullName());
    return user;
}
Related
We created a program to make the use of the database easier in other programs, so the code I'm showing gets used in multiple other programs.
One of those other programs gets about 10,000 records from one of our clients and has to check if these are in our database already. If not, we insert them into the database (they can also change and have to be updated then).
To make this easy we load all the entries from our whole table (at the moment 120,000), create a class for every entry we get and put all of them into a HashMap.
Loading the whole table this way takes around 5 minutes. We also sometimes have to restart the program because we run into a GC overhead error, since we work on limited hardware. Do you have an idea of how we can improve the performance?
Here is the code to load all entries (we have a global limit of 10,000 entries per query, so we use a loop):
public Map<String, IMasterDataSet> getAllInformationObjects(ISession session) throws MasterDataException {
IQueryExpression qe;
IQueryParameter qp;
// our main SDP class
Constructor<?> constructorForSDPbaseClass = getStandardConstructor();
SimpleDateFormat itaTimestampFormat = new SimpleDateFormat("yyyyMMddHHmmssSSS");
// search in standard time range (modification date!)
Calendar cal = Calendar.getInstance();
cal.set(2010, Calendar.JANUARY, 1);
Date startDate = cal.getTime();
Date endDate = new Date();
Long startDateL = Long.parseLong(itaTimestampFormat.format(startDate));
Long endDateL = Long.parseLong(itaTimestampFormat.format(endDate));
IDescriptor modDesc = IBVRIDescriptor.ModificationDate.getDescriptor(session);
// count once before to determine initial capacities for hash map/set
IBVRIArchiveClass SDP_ARCHIVECLASS = getMasterDataPropertyBag().getSDP_ARCHIVECLASS();
qe = SDP_ARCHIVECLASS.getQueryExpression(session);
qp = session.getDocumentServer().getClassFactory()
.getQueryParameterInstance(session, new String[] {SDP_ARCHIVECLASS.getDatabaseName(session)}, null, null);
qp.setExpression(qe);
qp.setHitLimitThreshold(0);
qp.setHitLimit(0);
int nrOfHitsTotal = session.getDocumentServer().queryCount(session, qp, "*");
int initialCapacity = (int) (nrOfHitsTotal / 0.75 + 1);
// MD sets; and objects already done (here: document ID)
HashSet<String> objDone = new HashSet<>(initialCapacity);
HashMap<String, IMasterDataSet> objRes = new HashMap<>(initialCapacity);
qp.close();
// do queries until hit count is smaller than 10.000
// use modification date
boolean keepGoing = true;
while(keepGoing) {
// construct query expression
// - basic part: Modification date & class type
// a. doc. class type
qe = SDP_ARCHIVECLASS.getQueryExpression(session);
// b. ID
qe = SearchUtil.appendQueryExpressionWithANDoperator(session, qe,
new PlainExpression(modDesc.getQueryLiteral() + " BETWEEN " + startDateL + " AND " + endDateL));
// 2. Query Parameter: set database; set expression
qp = session.getDocumentServer().getClassFactory()
.getQueryParameterInstance(session, new String[] {SDP_ARCHIVECLASS.getDatabaseName(session)}, null, null);
qp.setExpression(qe);
// order by modification date; hitlimit = 0 -> no hitlimit, but the usual 10.000 max
qp.setOrderByExpression(session.getDocumentServer().getClassFactory().getOrderByExpressionInstance(modDesc, true));
qp.setHitLimitThreshold(0);
qp.setHitLimit(0);
// Do not sort by modification date;
qp.setHints("+NoDefaultOrderBy");
keepGoing = false;
IInformationObject[] hits = null;
IDocumentHitList hitList = null;
hitList = session.getDocumentServer().query(qp, session);
IDocument doc;
if (hitList.getTotalHitCount() > 0) {
hits = hitList.getInformationObjects();
for (IInformationObject hit : hits) {
String objID = hit.getID();
if(!objDone.contains(objID)) {
// do something with this object and the class
// here: construct a new SDP sub class object and give it back via interface
doc = (IDocument) hit;
IMasterDataSet mdSet;
try {
mdSet = (IMasterDataSet) constructorForSDPbaseClass.newInstance(session, doc);
} catch (Exception e) {
// cause for this
String cause = (e.getCause() != null) ? e.getCause().toString() : MasterDataException.ERRMSG_PART_UNKNOWN;
throw new MasterDataException(MasterDataException.ERRMSG_NOINSTANCE_POSSIBLE, this.getClass().getSimpleName(), e.toString(), cause);
}
objRes.put(mdSet.getID(), mdSet);
objDone.add(objID);
}
}
doc = (IDocument) hits[hits.length - 1];
Date lastModDate = ((IDateValue) doc.getDescriptor(modDesc).getValues()[0]).getValue();
startDateL = Long.parseLong(itaTimestampFormat.format(lastModDate));
keepGoing = (hits.length >= 10000 || hitList.isResultSetTruncated());
}
qp.close();
}
return objRes;
}
Loading 120,000 rows (and more) each time will not scale very well, and your solution may not work in the future as the record size grows. Instead let the database server handle the problem.
Your table needs to have a primary key or unique key based on the columns of the records. Iterate through the 10,000 records, performing a JDBC SQL update for each one that modifies all field values, with a where clause that exactly matches the primary/unique key.
update BLAH set COL1 = ?, COL2 = ? where PKCOL = ?; // ... AND PKCOL2 =? ...
This modifies an existing row or does nothing at all, and JDBC executeUpdate() will return 0 or 1 indicating the number of rows changed. If the number of rows changed was zero, you have detected a new record which does not exist yet, so perform an insert for that new record only.
insert into BLAH (COL1, COL2, ... PKCOL) values (?,?, ..., ?);
You can decide whether to run 10,000 updates followed by however many inserts are needed, or to do update plus optional insert per record, and remember that JDBC batch statements and turning auto-commit off may help speed things up.
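A minimal sketch of that update-or-insert loop in JDBC, assuming the BLAH table from the statements above and a hypothetical Record class holding the incoming client values:
connection.setAutoCommit(false);
try (PreparedStatement update = connection.prepareStatement(
         "update BLAH set COL1 = ?, COL2 = ? where PKCOL = ?");
     PreparedStatement insert = connection.prepareStatement(
         "insert into BLAH (COL1, COL2, PKCOL) values (?, ?, ?)")) {
    for (Record r : incomingRecords) { // the ~10,000 records from the client
        update.setString(1, r.getCol1());
        update.setString(2, r.getCol2());
        update.setString(3, r.getKey());
        if (update.executeUpdate() == 0) { // 0 rows changed -> record does not exist yet
            insert.setString(1, r.getCol1());
            insert.setString(2, r.getCol2());
            insert.setString(3, r.getKey());
            insert.addBatch();
        }
    }
    insert.executeBatch(); // batch the inserts
    connection.commit();
}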
I am using a ConcurrentSkipListMap in my code and I want to make sure a method is safe under concurrent access.
public synchronized UserLogin getOrCreateLogin(Integer userId) {
// new user
if (!sessionCache.containsKey(userId)) {
String sessionKey = sessionKeyPool.getKey();
sessionCache.putIfAbsent(userId, sessionKey);
return userSessions.computeIfAbsent(sessionKey, userLogin -> new UserLogin(userId, sessionKey));
} else { // old user, check session
String oldSessionKey = sessionCache.get(userId);
UserLogin old = userSessions.get(oldSessionKey);
if (!old.isExpired()) return old;
log.info("Session expired! Create another. User id: " + userId + ", last login: " + old.getStartTime());
String newKey = sessionKeyPool.getKey();
UserLogin another = new UserLogin(userId, newKey);
userSessions.replace(newKey, another);
sessionCache.replace(userId, newKey);
return another;
}
}
I know that the map performs containsKey(), get(), putIfAbsent() and so on atomically. But that does not mean the whole block cannot be entered by two threads at the same time; it only means each individual call is atomic and establishes a happens-before relationship (the first thread to reach a given call completes it before the other sees its effect). How do I actually compose these methods so the result is consistent? How do I lock these lines?
How can I lock with the finest granularity possible without compromising the consistency of the output data?
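One option, sketched below, is to keep a per-user lock object so the whole check-then-act sequence is atomic per userId while different users never block each other; the userLocks map is my addition, not part of the original code, and the sketch assumes all session changes go through this method:
private final ConcurrentMap<Integer, Object> userLocks = new ConcurrentHashMap<>();

public UserLogin getOrCreateLogin(Integer userId) {
    // computeIfAbsent on ConcurrentHashMap is atomic, so every userId maps to exactly one lock
    Object lock = userLocks.computeIfAbsent(userId, k -> new Object());
    synchronized (lock) {
        String sessionKey = sessionCache.get(userId);
        if (sessionKey != null) {
            UserLogin existing = userSessions.get(sessionKey);
            if (existing != null && !existing.isExpired()) return existing;
        }
        String newKey = sessionKeyPool.getKey();
        UserLogin login = new UserLogin(userId, newKey);
        userSessions.put(newKey, login);
        sessionCache.put(userId, newKey);
        return login;
    }
}
Note that the lock map grows with the number of distinct user ids; entries would have to be cleaned up when sessions are removed.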
I'm ingesting a stream of data into Flink. For each 'instance' of this data, I have a timestamp. I can detect if the machine I'm getting the data from is 'producing' or 'not producing'; this is done via a custom flat map function located in its own static class.
I want to calculate how long the machine has been producing / not producing.
My current approach is collecting the production and non production timestamps in two plain lists. For each 'instance' of the data, I calculate the current production/non-production duration by subtracting the latest timestamp from the earliest timestamp. This is giving me incorrect results, though. When the production state changes from producing to non producing, I clear the timestamp list for producing and vice versa, so that if the production starts again, the duration starts from zero.
I've looked into the two lists I collect the respective timestamps in and I see things I don't understand. My assumption is that, as long as the machine 'produces', the first timestamp in the production timestamp list stays the same, while new timestamps are added to the list per new instance of data.
Apparently, this assumption is wrong, since I get seemingly random timestamps in the lists. They are still correctly ordered, though.
Here's my code for the flatmap function:
public static class ImaginePaperDataConverterRich extends RichFlatMapFunction<ImaginePaperData, String> {
private static final long serialVersionUID = 4736981447434827392L;
private transient ValueState<ProductionState> stateOfProduction;
SimpleDateFormat dateFormat = new SimpleDateFormat("dd.MM.yyyy HH:mm:ss.SS");
DateFormat timeDiffFormat = new SimpleDateFormat("dd HH:mm:ss.SS");
String timeDiffString = "00 00:00:00.000";
List<String> productionTimestamps = new ArrayList<>();
List<String> nonProductionTimestamps = new ArrayList<>();
public String calcProductionTime(List<String> timestamps) {
if (!timestamps.isEmpty()) {
try {
Date firstDate = dateFormat.parse(timestamps.get(0));
Date lastDate = dateFormat.parse(timestamps.get(timestamps.size()-1));
long timeDiff = lastDate.getTime() - firstDate.getTime();
if (timeDiff < 0) {
System.out.println("Something weird happened. Maybe EOF.");
return timeDiffString;
}
timeDiffString = String.format("%02d %02d:%02d:%02d.%02d",
TimeUnit.MILLISECONDS.toDays(timeDiff),
TimeUnit.MILLISECONDS.toHours(timeDiff) % TimeUnit.HOURS.toHours(1),
TimeUnit.MILLISECONDS.toMinutes(timeDiff) % TimeUnit.HOURS.toMinutes(1),
TimeUnit.MILLISECONDS.toSeconds(timeDiff) % TimeUnit.MINUTES.toSeconds(1),
TimeUnit.MILLISECONDS.toMillis(timeDiff) % TimeUnit.SECONDS.toMillis(1));
} catch (ParseException e) {
e.printStackTrace();
}
System.out.println("State duration: " + timeDiffString);
}
return timeDiffString;
}
@Override
public void open(Configuration config) {
ValueStateDescriptor<ProductionState> descriptor = new ValueStateDescriptor<>(
"stateOfProduction",
TypeInformation.of(new TypeHint<ProductionState>() {}),
ProductionState.NOT_PRODUCING);
stateOfProduction = getRuntimeContext().getState(descriptor);
}
@Override
public void flatMap(ImaginePaperData ImaginePaperData, Collector<String> output) throws Exception {
List<String> warnings = new ArrayList<>();
JSONObject jObject = new JSONObject();
String productionTime = "0";
String nonProductionTime = "0";
// Data analysis
if (stateOfProduction == null || stateOfProduction.value() == ProductionState.NOT_PRODUCING && ImaginePaperData.actSpeedCl > 60.0) {
stateOfProduction.update(ProductionState.PRODUCING);
} else if (stateOfProduction.value() == ProductionState.PRODUCING && ImaginePaperData.actSpeedCl < 60.0) {
stateOfProduction.update(ProductionState.NOT_PRODUCING);
}
if(stateOfProduction.value() == ProductionState.PRODUCING) {
if (!nonProductionTimestamps.isEmpty()) {
System.out.println("Production has started again, non production timestamps cleared");
nonProductionTimestamps.clear();
}
productionTimestamps.add(ImaginePaperData.timestamp);
System.out.println(productionTimestamps);
productionTime = calcProductionTime(productionTimestamps);
} else {
if(!productionTimestamps.isEmpty()) {
System.out.println("Production has stopped, production timestamps cleared");
productionTimestamps.clear();
}
nonProductionTimestamps.add(ImaginePaperData.timestamp);
warnings.add("Production has stopped.");
System.out.println(nonProductionTimestamps);
//System.out.println("Production stopped");
nonProductionTime = calcProductionTime(nonProductionTimestamps);
}
// The rest is just JSON stuff
Do I maybe have to hold these two timestamp lists in a ListState?
EDIT: Because another user asked, here is the data I'm getting.
{'szenario': 'machine01', 'timestamp': '31.10.2018 09:18:39.432069', 'data': {1: 100.0, 2: 100.0, 101: 94.0, 102: 120.0, 103: 65.0}}
The behaviour I expect is that my flink program collects the timestamps in the two lists productionTimestamps and nonProductionTimestamps. Then I want my calcProductionTime method to subtract the last timestamp in the list from the first timestamp, to get the duration between when I first detected the machine is "producing" / "not-producing" and the time it stopped "producing" / "not-producing".
I found out that the reason for the 'seemingly random' timestamps is Apache Flink's parallel execution. When the parallelism is set to > 1, the order of events isn't guaranteed anymore.
My quick fix was to set the parallelism of my program to 1; as far as I know, this guarantees the order of events.
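For reference, a minimal sketch of that quick fix on the execution environment (set globally here; it can also be set per operator):
// Force a single parallel instance so events keep their arrival order.
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setParallelism(1);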
I have been wondering if there is a way to access the full list of a user's Twitter followers.
We have tried using a call to the REST API via twitter4j:
public List<User> getFriendList() {
List<User> friendList = null;
try {
friendList = mTwitter.getFollowersList(mTwitter.getId(), -1);
} catch (IllegalStateException e) {
e.printStackTrace();
} catch (TwitterException e) {
e.printStackTrace();
}
return friendList;
}
But it returns only a list of 20 followers.
I tried using the same call in a loop, but it causes a rate limit exception, saying we are not allowed to make too many requests in a small interval of time.
Do we have a way around this?
You should definitely use getFollowersIDs. As the documentation says, this returns an array (list) of IDs objects. Note that it causes the list to be broken into pages of around 5000 IDs at a time. To begin paging provide a value of -1 as the cursor. The response from the API will include a previous_cursor and next_cursor to allow paging back and forth.
The tricky part is to handle the cursor. If you can do this, then you will not have the problem of getting only 20 followers.
The first call to getFollowersIDs will need to be given a cursor of -1. For subsequent calls, you need to update the cursor value, by getting the next cursor, as done in the while part of the loop.
long cursor = -1L;
IDs ids;
do {
    ids = twitter.getFollowersIDs(cursor);
    for (long userID : ids.getIDs()) {
        friendList.add(userID);
    }
} while ((cursor = ids.getNextCursor()) != 0);
Here is a very good reference:
https://github.com/yusuke/twitter4j/blob/master/twitter4j-examples/src/main/java/twitter4j/examples/friendsandfollowers/GetFriendsIDs.java
Now, if the user has more than around 75000 followers, you will have to do some waiting (see Vishal's answer).
The first 15 calls will yield you around 75000 IDs. Then you will have to sleep for 15 minutes. Then make another 15 calls, and so on till you get all the followers. This can be done using a simple Thread.sleep(time_in_milliseconds) outside the for loop.
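A sketch of that paging loop with rate-limit handling built in; it relies on twitter4j's TwitterException exposing the rate limit status, and the waiting is the same idea as the Thread.sleep mentioned above:
// Page through all follower ids, backing off when the rate limit is hit.
// Assumes an authenticated twitter4j Twitter instance named "twitter".
static List<Long> fetchAllFollowerIds(Twitter twitter) throws TwitterException, InterruptedException {
    List<Long> followerIds = new ArrayList<>();
    long cursor = -1L;
    IDs ids;
    do {
        try {
            ids = twitter.getFollowersIDs(cursor);
        } catch (TwitterException e) {
            if (e.exceededRateLimitation() && e.getRateLimitStatus() != null) {
                // Wait until the 15-minute window resets, then retry the same cursor.
                Thread.sleep((e.getRateLimitStatus().getSecondsUntilReset() + 1) * 1000L);
                continue;
            }
            throw e;
        }
        for (long id : ids.getIDs()) {
            followerIds.add(id);
        }
        cursor = ids.getNextCursor();
    } while (cursor != 0);
    return followerIds;
}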
Just change it like this and try; this is working for me:
try {
    Log.i("act twitter...........", "ModifiedCustomTabBarActivity.class");
    // final JSONArray twitterFriendsIDsJsonArray = new JSONArray();
    IDs ids = mTwitter.mTwitter.getFriendsIDs(-1); // first page, cursor = -1
    boolean hasMore;
    do {
        for (long id : ids.getIDs()) {
            String ID = "followers ID #" + id;
            String[] firstname = ID.split("#");
            String first_Name = firstname[0];
            String Id = firstname[1];
            Log.i("split...........", first_Name + Id);
            String Name = mTwitter.mTwitter.showUser(id).getName();
            String screenname = mTwitter.mTwitter.showUser(id).getScreenName();
            // Log.i("id.......", "followers ID #" + id);
            // Log.i("Name..", mTwitter.mTwitter.showUser(id).getName());
            // Log.i("Screen_Name...", mTwitter.mTwitter.showUser(id).getScreenName());
            // Log.i("image...", mTwitter.mTwitter.showUser(id).getProfileImageURL());
        }
        hasMore = ids.hasNext();
        if (hasMore) {
            // fetch the next page; without this the loop never advances
            ids = mTwitter.mTwitter.getFriendsIDs(ids.getNextCursor());
        }
    } while (hasMore);
} catch (Exception e) {
    e.printStackTrace();
}
Try This...
ConfigurationBuilder confbuilder = new ConfigurationBuilder();
confbuilder.setOAuthAccessToken(accessToken)
    .setOAuthAccessTokenSecret(secretToken)
    .setOAuthConsumerKey(TwitterOAuthActivity.CONSUMER_KEY)
    .setOAuthConsumerSecret(TwitterOAuthActivity.CONSUMER_SECRET);
Twitter twitter = new TwitterFactory(confbuilder.build()).getInstance();
PagableResponseList<User> followersList;
ArrayList<String> list = new ArrayList<String>();
try {
    followersList = twitter.getFollowersList(screenName, cursor);
    for (int i = 0; i < followersList.size(); i++) {
        User user = followersList.get(i);
        String name = user.getName();
        list.add(name);
        System.out.println("Name" + i + ":" + name);
    }
    listView.setAdapter(new ArrayAdapter<String>(this, android.R.layout.simple_list_item_1, list));
    listView.setVisibility(View.VISIBLE);
    friend_list.setVisibility(View.INVISIBLE);
    post_feeds.setVisibility(View.INVISIBLE);
    twit.setVisibility(View.INVISIBLE);
} catch (TwitterException e) { // the original try block was missing a catch
    e.printStackTrace();
}
This is a tricky one.
You should specify whether you're using application or per user tokens and the number of users you're fetching followers_ids for.
You get just 15 calls per 15 minutes in case of an application token. You can fetch a maximum of 5000 followers_ids per call. That gives you a maximum of 75K followers_ids per 15 minutes.
If any of the users you're fetching followers_ids for has over 75K followers, you'll get the rate_limit error immediately. If you're fetching for more than 1 user, you'll need to build strong rate_limit handling in your code with sleeps and be very patient.
The same applies for friends_ids.
I've not had to deal with fetching more than 75K followers/friends for a given user but come to think of it, I don't know if it's even possible anymore.
I have Java code that generates a request number based on data received from the database, and then updates the database with the newly generated request number.
synchronized (this.getClass()) {
counter++;
System.out.println(counter);
System.out.println("start " + System.identityHashCode(this));
certRequest
.setRequestNbr(generateRequestNumber(certInsuranceRequestAddRq
.getAccountInfo().getAccountNumberId()));
System.out.println("outside funcvtion"+certRequest.getRequestNbr());
reqId = Utils.getUniqueId();
certRequest.setRequestId(reqId);
System.out.println(reqId);
ItemIdInfo itemIdInfo = new ItemIdInfo();
itemIdInfo.setInsurerId(certRequest.getRequestId());
certRequest.setItemIdInfo(itemIdInfo);
dao.insert(certRequest);
addAccountRel();
counter++;
System.out.println(counter);
System.out.println("end");
}
The output of the System.out.println() statements is:
1
start 27907101
com.csc.exceed.certificate.domain.CertRequest#a042cb
inside function request number66
outside funcvtion66
AF88172D-C8B0-4DCD-9AC6-12296EF8728D
2
end
3
start 21695531
com.csc.exceed.certificate.domain.CertRequest#f98690
inside function request number66
outside funcvtion66
F3200106-6033-4AEC-8DC3-B23FCD3CA380
4
end
In my case this code is called from two threads.
If you observe, both threads run independently. However, the request number data is the same in both cases.
Is it possible that the second thread starts executing before the database update for the first thread completes?
The code for generateRequestNumber() is as follows:
public String generateRequestNumber(String accNumber) throws Exception {
String requestNumber = null;
if (accNumber != null) {
String SQL_QUERY = "select CERTREQUEST.requestNbr from CertRequest as CERTREQUEST, "
+ "CertActObjRel as certActObjRel where certActObjRel.certificateObjkeyId=CERTREQUEST.requestId "
+ " and certActObjRel.certObjTypeCd=:certObjTypeCd "
+ " and certActObjRel.certAccountId=:accNumber ";
String[] parameterNames = { "certObjTypeCd", "accNumber" };
Object[] parameterVaues = new Object[] {
Constants.REQUEST_RELATION_CODE, accNumber };
List<?> resultSet = dao.executeNamedQuery(SQL_QUERY,
parameterNames, parameterVaues);
// List<?> resultSet = dao.retrieveTableData(SQL_QUERY);
if (resultSet != null && resultSet.size() > 0) {
requestNumber = (String) resultSet.get(0);
}
int maxRequestNumber = -1;
if (requestNumber != null && requestNumber.length() > 0) {
maxRequestNumber = maxValue(resultSet.toArray());
requestNumber = Integer.toString(maxRequestNumber + 1);
} else {
requestNumber = Integer.toString(1);
}
System.out.println("inside function request number"+requestNumber);
return requestNumber;
}
return null;
}
Databases allow multiple simultaneous connections, so unless you write your code properly you can mess up the data.
Since you only seem to require a unique, growing integer, you can easily generate one safely inside the database with, for example, a sequence (if supported by the database). Databases that do not support sequences usually provide some other way (such as auto-increment columns in MySQL).
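As an illustration, a minimal sketch of letting the database hand out the number through an auto-increment / identity column and JDBC generated keys; the table and column names here are placeholders, not taken from your code:
// Insert without a request number and let the database assign it.
try (PreparedStatement ps = connection.prepareStatement(
        "insert into CertRequest (accountNumberId) values (?)",
        Statement.RETURN_GENERATED_KEYS)) {
    ps.setString(1, accountNumber);
    ps.executeUpdate();
    try (ResultSet keys = ps.getGeneratedKeys()) {
        if (keys.next()) {
            long requestNbr = keys.getLong(1); // unique and growing, no race between threads
            certRequest.setRequestNbr(Long.toString(requestNbr));
        }
    }
}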