I have two array lists of file names (each can be large, roughly 1k-5k entries).
The first one is created on the fly as new files are added:
addedfiles=['temp.java', 'TEMP.java', 'DENT.java', 'Seal.java']
Note: names such as temp.java and TEMP.java differ only in case, so they count as duplicates of each other.
The second list holds the files that are already present in the system:
ExistingFiles=['dent.java', 'temp1.java','comp.java']
Note: these are all distinct from each other.
I am trying to come up with optimal logic for adding the distinct, unique files from addedfiles to ExistingFiles.
In the example above, only Seal.java will be added to ExistingFiles, because it is the only name in addedfiles that is neither duplicated nor already present (ignoring case).
My logic:
1. Create a hashmap from addedfiles, keyed case-insensitively, like [name:count]
{temp.java:2, DENT.java:1,Seal.java:1}
2. Create a non-duplicate array from the names with count 1 =[DENT.java,Seal.java]
3. Compare ExistingFiles with the non-duplicate array (ignoring case) using sort and binary search;
if the search result is negative (not found), add the value from the non-duplicate array to ExistingFiles.
Is there a better way to do this, for example using union or intersection, or threads? Thanks :)
Assuming you don't need to retain the case of the file names, I would store the lower-cased names of the existing files in a Set, say existingFiles, and do it as follows:
Set<String> newFiles = new HashSet<>();
Set<String> dupFiles = new HashSet<>();
for (String filename : addedFiles) {
filename = filename.toLowerCase();
if (existingFiles.contains(filename)) { continue; }
if (newFiles.contains(filename)) {
dupFiles.add(filename);
} else {
newFiles.add(filename);
}
}
newFiles.removeAll(dupFiles);
existingFiles.addAll(newFiles);
This solution is a little memory-heavy, but if speed is critical, it works well.
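If the original spelling of the names does need to be kept, one possible variation is to use the lower-cased name only as a lookup key. The sketch below is a self-contained version of that idea, run against the sample data from the question; the class and method names (`CaseInsensitiveMerge`, `uniqueNewFiles`) are made up for illustration, and it assumes Java 9+ for `List.of`.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

class CaseInsensitiveMerge {

    // Returns the names from addedFiles that occur exactly once (ignoring case)
    // and are not already in existingFiles (ignoring case), in their original spelling.
    static List<String> uniqueNewFiles(List<String> existingFiles, List<String> addedFiles) {
        Set<String> existingKeys = new HashSet<>();
        for (String name : existingFiles) {
            existingKeys.add(name.toLowerCase(Locale.ROOT));
        }

        // lower-cased key -> first spelling seen; insertion order is preserved
        Map<String, String> candidates = new LinkedHashMap<>();
        Set<String> duplicateKeys = new HashSet<>();
        for (String name : addedFiles) {
            String key = name.toLowerCase(Locale.ROOT);
            if (existingKeys.contains(key)) {
                continue; // already present in the system
            }
            if (candidates.putIfAbsent(key, name) != null) {
                duplicateKeys.add(key); // seen before within addedFiles
            }
        }
        candidates.keySet().removeAll(duplicateKeys);
        return new ArrayList<>(candidates.values());
    }

    public static void main(String[] args) {
        List<String> existing = new ArrayList<>(List.of("dent.java", "temp1.java", "comp.java"));
        List<String> added = List.of("temp.java", "TEMP.java", "DENT.java", "Seal.java");
        existing.addAll(uniqueNewFiles(existing, added));
        System.out.println(existing); // [dent.java, temp1.java, comp.java, Seal.java]
    }
}
```

Like the version above, this is a single pass over each list with hash lookups, so no sorting or binary search is needed.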
I have a couple of XML files that need to be compared with a different set of similar XML files. While comparing, I need to ignore certain tags depending on the file, for example:
personal.xml - ignore fullname
address.xml - ignore zipcode
contact.xml - ignore homephone
Here is the code:
Diff documentDiff=DiffBuilder
.compare(actualxmlfile)
.withTest(expectedxmlfile)
.withNodeFilter(node -> !node.getNodeName().equals("FullName"))
.ignoreWhitespace()
.build();
How can I add more conditions at ".withNodeFilter(node -> !node.getNodeName().equals("FullName"))", or is there a smarter way to do this?
You can join multiple conditions together using "and" (&&):
private static void doDemo1(File actual, File expected) {
Diff docDiff = DiffBuilder
.compare(actual)
.withTest(expected)
.withNodeFilter(
node -> !node.getNodeName().equals("FullName")
&& !node.getNodeName().equals("ZipCode")
&& !node.getNodeName().equals("HomePhone")
)
.ignoreWhitespace()
.build();
System.out.println(docDiff.toString());
}
If you want to keep your builder tidy, you can move the node filter to a separate method:
private static void doDemo2(File actual, File expected) {
Diff docDiff = DiffBuilder
.compare(actual)
.withTest(expected)
.withNodeFilter(node -> testNode(node))
.ignoreWhitespace()
.build();
System.out.println(docDiff.toString());
}
private static boolean testNode(Node node) {
return !node.getNodeName().equals("FullName")
&& !node.getNodeName().equals("ZipCode")
&& !node.getNodeName().equals("HomePhone");
}
The risk with this is you may have element names which appear in more than one type of file - where that node needs to be filtered from one type of file, but not any others.
In this case, you would also need to take into account the type of file you are handling. For example, you can use the file names (if they follow a suitable naming convention) or use the root elements (assuming they are different) - such as <Personal>, <Address>, <Contact> - or whatever they are, in your case.
However, if you need to distinguish between XML file types, for this reason, you may be better off using that information to have separate DiffBuilder objects, with different filters. That may result in clearer code.
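As a rough sketch of that last idea, the ignored element names can be kept in a lookup table keyed by file name, and a filter built per file. The class `NodeFilters`, its `IGNORED` map, and `filterFor` are hypothetical names, and the mapping simply mirrors the three files from the question. The code uses only `java.util.function.Predicate` and `org.w3c.dom.Node`; depending on which predicate type your XMLUnit version expects in `withNodeFilter`, you may need to pass the result as a method reference such as `filter::test`.

```java
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

class NodeFilters {

    // Hypothetical mapping: file name -> element names to ignore in that file.
    static final Map<String, Set<String>> IGNORED = Map.of(
            "personal.xml", Set.of("FullName"),
            "address.xml", Set.of("ZipCode"),
            "contact.xml", Set.of("HomePhone"));

    // Builds a filter that keeps every node except the ones ignored for this file.
    static Predicate<Node> filterFor(String fileName) {
        Set<String> ignored = IGNORED.getOrDefault(fileName, Set.of());
        return node -> !ignored.contains(node.getNodeName());
    }

    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Node zip = doc.createElement("ZipCode");
        System.out.println(filterFor("address.xml").test(zip));  // false: filtered out
        System.out.println(filterFor("personal.xml").test(zip)); // true: kept
    }
}
```

With this in place, each comparison would build its own DiffBuilder and pass the filter for the file being compared, so an element name ignored in one file type stays visible in the others.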
In the answer linked below I provided a separate method for the !node.getNodeName().equals("FullName") check that you are using in your code. With that method you can pass in an array of the node names you want to ignore and check the results. If you need further conditions for your requirements, that method is the place to add them.
https://stackoverflow.com/a/68099435/13451711
So I am running some code which runs over 300k times. Each time this code runs, it returns up to 300k values. I am currently storing the results I get in an ArrayList:
List<List<Object>> thisList = new ArrayList<List<Object>>();
for (int i = 0; i < 300000; i++) {
thisList.add(new ArrayList<Object>());
}
for (int i = 0; i < 300000; i++) {
List<Object> result = someCode();
for (Object obj : result) {
thisList.get(obj.id).add(obj.value);
}
}
In this code, every time an obj is obtained, it has a value obj.id which specifies the index in the List where obj.value has to be stored.
What would be the most efficient way to store the results elsewhere as the search continues? My code seems to stop working past iteration 400, most likely because it runs out of memory. I have considered using a simple text document where each line represents a List<Object>, but from some Googling it seems there is no way to append to a specific line, and all the suggestions point towards overwriting the entire text document. I've never worked with databases before, which is why I am trying to avoid that for now.
I would appreciate suggestions on what I could do.
Edit: Is there a method that does not use a database, where the data can be stored after each iteration of the outer for loop?
For example, given a file which currently contains
List 0: obj.value1 obj.value2
List 1: obj.value1 obj.value4
...
List 300000: obj.value3 obj.value8
and result contains
{obj<1, 100>, obj<0, 3>, ...}
where each object is of the form obj<id, value>, the file becomes
List 0: obj.value1 obj.value2 obj.value3
List 1: obj.value1 obj.value4 obj.value100
...
List 300000: obj.value3 obj.value8
You could store it in an XML file using the JAXB API.
Here is a link to a short tutorial on JAXB:
https://dzone.com/articles/using-jaxb-for-xml-with-java
Or you could store it in a JSON file using the json-simple API.
Here's another short tutorial:
https://stackabuse.com/reading-and-writing-json-in-java/
These are the links to get JAXB and json-simple from Maven:
JAXB: https://mvnrepository.com/artifact/javax.xml.bind/jaxb-api
json-simple: https://mvnrepository.com/artifact/com.googlecode.json-simple/json-simple
Hope this is useful to you.
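A different option, not the XML/JSON approach above, is an append-only text file: since a line in the middle of a file cannot be extended in place, each result can instead be appended as its own "id, value" record after every iteration of the outer loop, and the records grouped by id only when they are read back. The sketch below assumes ids and values fit in an `int` and uses a tab-separated format; `AppendOnlyStore`, `append`, and `load` are illustrative names, not part of any library.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

class AppendOnlyStore {

    // Appends one "id<TAB>value" line per pair; existing content is never rewritten.
    static void append(Path file, List<int[]> idValuePairs) throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
            for (int[] pair : idValuePairs) {
                out.write(pair[0] + "\t" + pair[1]);
                out.newLine();
            }
        }
    }

    // Reads the file back and groups the values by id, in the order they were written.
    static Map<Integer, List<Integer>> load(Path file) throws IOException {
        Map<Integer, List<Integer>> byId = new TreeMap<>();
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            lines.forEach(line -> {
                if (line.isEmpty()) {
                    return; // skip blank lines
                }
                String[] parts = line.split("\t");
                byId.computeIfAbsent(Integer.valueOf(parts[0]), k -> new ArrayList<>())
                        .add(Integer.valueOf(parts[1]));
            });
        }
        return byId;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("results", ".tsv");
        append(file, List.of(new int[] {1, 100}, new int[] {0, 3})); // iteration 1
        append(file, List.of(new int[] {1, 7}));                     // iteration 2
        System.out.println(load(file)); // {0=[3], 1=[100, 7]}
    }
}
```

Here `load` pulls everything back into memory purely for illustration. If the full data set does not fit, the same file can be scanned once per id of interest, or sorted by id with an external tool before processing.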
I have the following list which contains a series of folder paths. Some of these are redundant so I need to remove them and the final list should only contain the bottom level folders:
Initial list:
var paths = new List<string>
{
"Pavements/",
"Pavements/2019_05/",
"Pavements/2019_06/",
"Pavements/2019_06/A/",
"Roads/",
"Roads/2019_06/"
};
The final List should look like:
paths =
{
"Pavements/2019_05/",
"Pavements/2019_06/A/",
"Roads/2019_06/"
};
i.e. all the upper level folder paths have been removed.
Does anyone know how I can achieve this? I have a feeling I need a recursive method but am unsure how to go about it. I am using C#, but an answer in Java or something similar is OK.
Thanks.
One way to do this is with a LINQ query that compares each item to all the other items and keeps an item only if none of the others begin with it:
paths = paths.Where(path => !paths.Any(p => p != path && p.StartsWith(path))).ToList();
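Since the question also accepts Java, here is a sketch of a related approach rather than a translation of the query above. It sorts the paths first: in sorted order, anything that begins with a given path comes directly after it, so each path only has to be compared with its successor. This assumes every path ends with "/" as in the sample, so that a prefix match really means "parent folder". `LeafFolders` and `leafFolders` are illustrative names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

class LeafFolders {

    // Keeps only the paths that are not a prefix of any other path in the list.
    static List<String> leafFolders(List<String> paths) {
        // TreeSet sorts the paths and drops exact duplicates.
        List<String> sorted = new ArrayList<>(new TreeSet<>(paths));
        List<String> leaves = new ArrayList<>();
        for (int i = 0; i < sorted.size(); i++) {
            String current = sorted.get(i);
            // If current has any descendant, the very next sorted entry is one of them.
            boolean hasChild = i + 1 < sorted.size()
                    && sorted.get(i + 1).startsWith(current);
            if (!hasChild) {
                leaves.add(current);
            }
        }
        return leaves;
    }

    public static void main(String[] args) {
        List<String> paths = List.of(
                "Pavements/", "Pavements/2019_05/", "Pavements/2019_06/",
                "Pavements/2019_06/A/", "Roads/", "Roads/2019_06/");
        System.out.println(leafFolders(paths));
        // [Pavements/2019_05/, Pavements/2019_06/A/, Roads/2019_06/]
    }
}
```

The result comes back in sorted order rather than input order, and the cost is dominated by the sort instead of comparing every pair of paths.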
protected static void attSelection_w(Instances data) throws Exception {
AttributeSelection fs = new AttributeSelection();
WrapperSubsetEval wrapper = new WrapperSubsetEval();
wrapper.buildEvaluator(data);
wrapper.setClassifier(new RandomForest());
wrapper.setFolds(10);
wrapper.setThreshold(0.001);
fs.SelectAttributes(data);
fs.setEvaluator(wrapper);
fs.setSearch(new BestFirst());
System.out.println(fs.toResultsString());
}
Above is my code for wrapper-based attribute selection using random forest with best-first search. However, it somehow produces a result using CFS, as shown below.
Search Method:
Greedy Stepwise (forwards).
Start set: no attributes
Merit of best subset found: 0.287
Attribute Subset Evaluator (supervised, Class (nominal): 9 class):
CFS Subset Evaluator
Including locally predictive attributes
There is no other code using CFS in the whole class, and I'm pretty much stuck. I would appreciate any help. Thanks!
You have the calls in the wrong order, so the selection runs with the default evaluator and search method (the CFS and greedy stepwise shown in your output). Set the evaluator and search first, then call the selection:
//first
fs.setEvaluator(wrapper);
fs.setSearch(new BestFirst());
//then
fs.SelectAttributes(data);
Just set the class index by adding this line after creating the Instances object data:
data.setClassIndex(data.numAttributes() - 1);
I checked and it worked fine.
Just to be sure I'm not reinventing the wheel, I want to see if there is some known algorithm, class, or something that can help me solve my problem. I have a huge list of URLs from an application. I'd like to feed those URLs into a tree to create a sitemap-like data structure.
It seems that something like this may have been done before. However, everything I find in my searches goes from XML to a tree. Ideally I'd like an answer in Java, but I'm sure I could translate it to Java myself if necessary. If I need to do it myself, I'd probably take each URL and break it into indexes.
[root] [0] [1] [1] -file
www.site.com/dir1/dir2/file.html
[root] [0] [1] [1]
www.site.com/dirabc/dir2/file.html
So, I'd parse each URL into offsets [0], [1], [2], … etc., and those would give the depth in the tree at which to add each part. That was at least my initial plan. I'm open to any and all suggestions!
You could define your UrlTree as nested HashMaps:
public class UrlTree {
private final Map<String, UrlTree> branches = new HashMap<String, UrlTree>();
public void add(String[] tokens, int i) {
if (i >= tokens.length) {
return;
}
final String token = tokens[i];
UrlTree branch = branches.get(token);
if (branch == null) {
branch = new UrlTree();
branches.put(token, branch);
}
branch.add(tokens, i + 1);
}
...
}
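To make the nested-map idea concrete, here is a small self-contained sketch in the same spirit. It is not the elided remainder of the class above: `UrlTreeSketch`, its `add(String)` overload, and `render` are illustrative names. It assumes URLs without a scheme, as in the question's examples, and uses a TreeMap instead of a HashMap only so that the printed order is stable. `String.repeat` needs Java 11+.

```java
import java.util.Map;
import java.util.TreeMap;

class UrlTreeSketch {

    // One child node per path segment; TreeMap keeps children sorted by name.
    private final Map<String, UrlTreeSketch> branches = new TreeMap<>();

    // Splits the URL on "/" and walks down the tree, creating nodes as needed.
    void add(String url) {
        UrlTreeSketch node = this;
        for (String token : url.split("/")) {
            if (token.isEmpty()) {
                continue; // ignore empty segments such as those from "//"
            }
            node = node.branches.computeIfAbsent(token, k -> new UrlTreeSketch());
        }
    }

    // Renders the tree as indented text, two spaces per level.
    String render() {
        StringBuilder sb = new StringBuilder();
        render(sb, 0);
        return sb.toString();
    }

    private void render(StringBuilder sb, int depth) {
        for (Map.Entry<String, UrlTreeSketch> entry : branches.entrySet()) {
            sb.append("  ".repeat(depth)).append(entry.getKey()).append('\n');
            entry.getValue().render(sb, depth + 1);
        }
    }

    public static void main(String[] args) {
        UrlTreeSketch root = new UrlTreeSketch();
        root.add("www.site.com/dir1/dir2/file.html");
        root.add("www.site.com/dirabc/dir2/file.html");
        System.out.print(root.render());
    }
}
```

Both URLs share the www.site.com node, under which dir1 and dirabc each get their own dir2/file.html branch.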
You'll need to implement TreeModel in a way that reflects the hierarchy of your observed directory structure. FileTreeModel is an example, and ac.Name is a simple class that parses paths for a vintage file system. See also How to Use Trees. An instance of NetBeans Outline, illustrated here, would make a nice alternative view.