Get modified files from locally cloned Git repository using Java

In a cloned Git repository, I want to pick only the files that are modified (i.e., files that are ready to commit, or that show up as 'modified' when I run 'git status'). I do not want to do this by comparing modification dates, since the files could have been changed on any day over a period of time.
I need the collection of file names with their absolute file paths.
Is there any Git utility for Java that can do this, or what would be a better approach?

import java.io.File;
import java.util.Set;

import org.eclipse.jgit.api.Git;
import org.eclipse.jgit.api.Status;
import org.eclipse.jgit.api.errors.GitAPIException;

public class GitModifiedFileExtractor {

    public static void main(String[] args) throws IllegalStateException, GitAPIException {
        // Point JGit at the clone and run the equivalent of 'git status'.
        Git myGitRepo = Git.init().setDirectory(new File("C:\\myClonedGitRepo")).call();
        Status status = myGitRepo.status().call();

        Set<String> modifiedFiles = status.getModified();
        for (String modifiedFile : modifiedFiles) {
            System.out.println("Modified File - " + modifiedFile);
        }
    }
    // Similarly we can get the added, missing, removed, untracked, etc. files
    // from the Status object.
}
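Since the question asks for absolute paths and getModified() returns paths relative to the repository root, here is a small follow-up sketch (my addition, not part of the original answer) that resolves them against the work tree. It assumes the same JGit dependency and repository location as above; Git.open() is used because the repository already exists.

import java.io.File;
import java.io.IOException;

import org.eclipse.jgit.api.Git;
import org.eclipse.jgit.api.errors.GitAPIException;

public class GitModifiedFilePaths {

    public static void main(String[] args) throws IOException, GitAPIException {
        // Open the existing clone (instead of re-initialising it) and query its status.
        try (Git git = Git.open(new File("C:\\myClonedGitRepo"))) {
            File workTree = git.getRepository().getWorkTree();
            for (String relativePath : git.status().call().getModified()) {
                // Resolve the repository-relative path against the work tree root.
                System.out.println(new File(workTree, relativePath).getAbsolutePath());
            }
        }
    }
}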

Related

Unable to read a (text) file in FileProcessingMode.PROCESS_CONTINUOUSLY mode

I have a requirement to read files continuously from a specific path.
That is, the Flink job should continuously poll the specified location and read any file that arrives there at certain intervals.
Example: the location on my Windows machine is C:/inputfiles, which gets file_1.txt at 2:00 PM, file_2.txt at 2:30 PM, file_3.txt at 3:00 PM.
I experimented with the code below.
import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.common.io.FilePathFilter;
import org.apache.flink.api.java.io.TextInputFormat;
import org.apache.flink.core.fs.FileSystem;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.source.FileProcessingMode;
import org.apache.flink.util.Collector;

import java.util.Arrays;
import java.util.List;

public class ContinuousFileProcessingTest {

    public static void main(String[] args) throws Exception {
        final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(10);

        String localFsURI = "D:\\FLink\\2021_01_01\\";

        TextInputFormat format = new TextInputFormat(new org.apache.flink.core.fs.Path(localFsURI));
        format.setFilesFilter(FilePathFilter.createDefaultFilter());

        DataStream<String> inputStream =
                env.readFile(format, localFsURI, FileProcessingMode.PROCESS_CONTINUOUSLY, 100);

        SingleOutputStreamOperator<String> soso = inputStream.map(String::toUpperCase);
        soso.print();
        soso.writeAsText("D:\\FLink\\completed", FileSystem.WriteMode.OVERWRITE);

        env.execute("read and write");
    }
}
To test this on a Flink cluster, I brought a cluster up on Flink 1.9.2 and was able to achieve my goal of reading files continuously at intervals.
Note: Flink 1.9.2 can bring up a cluster on a Windows machine.
Now I have to upgrade from 1.9.2 to 1.12, and we used Docker to bring up the 1.12 cluster (unlike 1.9.2).
I changed the file location from the Windows path to the corresponding Docker location, but the same program does not work there.
Moreover:
Accessing the files is not the problem: if I put files in place before starting the job, the job reads them correctly, but if I add a new file at runtime it does not pick up the newly added files.
Need help finding a solution.
Thanks in advance.
Try changing the directory scan interval in the sample code to Duration.ofSeconds(50).toMillis(), and check out StreamExecutionEnvironment.setRuntimeMode(RuntimeExecutionMode.AUTOMATIC).
For RuntimeExecutionMode, see https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/api/common/RuntimeExecutionMode.html
Working code as below:
import java.time.Duration;

import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.api.common.io.FilePathFilter;
import org.apache.flink.api.java.io.TextInputFormat;
import org.apache.flink.core.fs.FileSystem;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.source.FileProcessingMode;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class ContinuousFileProcessingTest {

    private static final Logger log = LoggerFactory.getLogger(ContinuousFileProcessingTest.class);

    public static void main(String[] args) throws Exception {
        final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(10);
        env.setRuntimeMode(RuntimeExecutionMode.AUTOMATIC);

        String localFsURI = "file:///usr/test";

        // Create the monitoring source along with the necessary readers.
        TextInputFormat format = new TextInputFormat(new org.apache.flink.core.fs.Path(localFsURI));
        log.info("format : " + format.toString());
        format.setFilesFilter(FilePathFilter.createDefaultFilter());
        log.info("setFilesFilter : " + FilePathFilter.createDefaultFilter().toString());
        log.info("getFilesFilter : " + format.getFilePath().toString());

        DataStream<String> inputStream =
                env.readFile(format, localFsURI, FileProcessingMode.PROCESS_CONTINUOUSLY,
                        Duration.ofSeconds(50).toMillis());

        SingleOutputStreamOperator<String> soso = inputStream.map(String::toUpperCase);
        soso.writeAsText("file:///usr/test/completed.txt", FileSystem.WriteMode.OVERWRITE);

        env.execute("read and write");
    }
}
This code works on Docker Desktop with Flink 1.12 and the container file path file:///usr/test. Note: keep the parallelism at a minimum of 2 so that files can be processed in parallel.
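For the parallelism note, a minimal sketch (my assumption about where the setting goes, not part of the original answer) is to request it on the environment right after it is created; the same effect can also be achieved with the -p flag when submitting the job.

final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// Request at least two parallel subtasks, mirroring the note above about parallelism.
env.setParallelism(2);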

How to delete java.util.prefs storage on a Mac?

I'm using the java.util.prefs package to store some information entered by users. My understanding, based on documentation and this question is that the actual (user node) preferences are stored in ~/Library/Preferences/, in a file named after the package. So far, this all checks out: Whenever I store some data in the node, a file in this directory is created and using the command line tool plutil, I can inspect it and find the stored data.
However: When I delete the file, and restart my program, the data is still there. I couldn't find anything about that in the documentation or source code. Any help appreciated.
The following code demonstrates the behaviour, see command line session below:
package de.unistuttgart.ims.PreferencesTest;

import java.io.IOException;
import java.util.prefs.Preferences;

public class Main {

    Preferences preferences = Preferences.userNodeForPackage(Main.class);

    static Main app;
    static String KEY = "KEY";
    static String DEFAULTVALUE = "DEFAULTVALUE";

    public static void main(String[] args) throws IOException {
        app = new Main();
        app.doStuff();
    }

    public void doStuff() throws IOException {
        System.err.println("Retrieving value:");
        System.err.println(preferences.get(KEY, DEFAULTVALUE));
        System.err.println("Setting value:");
        char ch = (char) System.in.read();
        preferences.put(KEY, String.valueOf(ch));
    }
}
Command line session:
$ java de.unistuttgart.ims.PreferencesTest.Main
Retrieving value:
DEFAULTVALUE
Setting value:
5
$ rm ~/Library/Preferences/de.unistuttgart.ims.plist
$ java de.unistuttgart.ims.PreferencesTest.Main
Retrieving value:
5
Setting value:
4
How can this be? Or: Where else are preferences stored?
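No answer is recorded here, but one likely explanation (unconfirmed) is that macOS caches plist-backed preferences, so deleting the file alone does not clear the cached values. If the goal is simply to wipe the stored data, the Preferences API itself can do it; a minimal sketch, assuming it lives in the same package as Main:

import java.util.prefs.BackingStoreException;
import java.util.prefs.Preferences;

public class ClearPrefs {

    public static void main(String[] args) throws BackingStoreException {
        // Same user node as the code above.
        Preferences node = Preferences.userNodeForPackage(Main.class);
        node.clear();        // drop every key/value pair stored in this node
        node.flush();        // push the change to the backing store right away
        // node.removeNode(); // alternatively, remove the node (and its children) entirely
    }
}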

How to rewrite one specific line of a text file in java

The image below shows the format of my settings file for a web bot I'm developing. If you look at line 31 in the image, you will see it says chromeVersion; this is how the program knows which version of the chromedriver to use. If the user enters an invalid response or leaves the field blank, the program detects that, determines the version itself, and saves the determined version to a string called chromeVersion. After this is done I want to replace line 31 of that file with
"(31) chromeVersion(76/77/78), if you don't know this field will be filled automatically upon the first run of the bot): " + chromeVersion
To be clear, I do not want to rewrite the whole file; I just want to either change the value assigned to chromeVersion in the text file or rewrite that line with the version included.
Any suggestions or ways to do this would be much appreciated.
[image: screenshot of the settings file]
You will need to rewrite the whole file, unless the byte length of the file stays exactly the same after your modification. Since that is not guaranteed, and checking for it is cumbersome, here is a simple procedure:
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

public class Lab1 {

    public static void main(String[] args) {
        String chromeVersion = "myChromeVersion";
        try {
            Path path = Paths.get("C:\\whatever\\path\\toYourFile.txt");
            List<String> lines = Files.readAllLines(path, StandardCharsets.UTF_8);
            // The list is zero-based, so line 31 of the file is index 30.
            int lineToModify = 30;
            lines.set(lineToModify, lines.get(lineToModify) + chromeVersion);
            Files.write(path, lines, StandardCharsets.UTF_8);
        } catch (IOException ex) {
            ex.printStackTrace();
        }
    }
}
Note that this is not the best approach for very large files, but for a small file like yours it is not an issue.
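If the goal is to overwrite the whole line rather than append to it, the same read/modify/write pattern applies. A minimal variation, reusing lines, path and chromeVersion from the snippet above (the prefix text is taken from the question and may need adjusting to the real settings file):

// Overwrite line 31 completely instead of appending to it (index 30 is zero-based).
int lineToModify = 30;
lines.set(lineToModify,
        "(31) chromeVersion(76/77/78), if you don't know this field will be filled"
        + " automatically upon the first run of the bot): " + chromeVersion);
Files.write(path, lines, StandardCharsets.UTF_8);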

Category Tree extraction in Wikipedia using Java

Basically I intend to extract the entire category tree in Wikipedia under the root node "Economics", using the Wikipedia API sandbox. I don't need the content of the articles; I just need a few basic details like pageid, title and revision history (at some later stage of my work). As of now I can extract it level by level, but what I want is a recursive/iterative function which does it.
Each category contains categories and articles (like each root contains nodes and leaves).
I wrote code to extract the first level into files: one file contains the articles, the second file contains the names of the categories (daughters of the root, which can be further sub-classified).
Then I went one level down and extracted their categories, articles and sub-categories using similar code.
The code remains similar in each case, but the issue is scalability. I need to reach the lowest leaves of all nodes, so I need a recursion which keeps checking until the end.
I labelled the files which contain categories with a 'c_' prefix, so I can provide the condition while extracting different levels.
Now for some reason it has entered into a deadlock and keeps adding the same things again and again. I need a way out of the deadlock.
package wikiCrawl;

import java.awt.List;
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Scanner;

import org.apache.commons.io.FileUtils;
import org.json.CDL;
import org.json.JSONArray;
import org.json.JSONException;
import org.json.JSONObject;

public class SubCrawl {

    public static void main(String[] args) throws IOException, InterruptedException, JSONException {
        File file = new File("C:/Users/User/Desktop/Root/Economics_2.txt");
        crawlfile(file);
    }

    public static void crawlfile(File food) throws JSONException, IOException, InterruptedException {
        ArrayList<String> cat_list = new ArrayList<String>();
        Scanner scanner_cat = new Scanner(food);
        scanner_cat.useDelimiter("\n");
        while (scanner_cat.hasNext()) {
            String scan_n = scanner_cat.next();
            if (scan_n.indexOf(":") > -1)
                cat_list.add(scan_n.substring(scan_n.indexOf(":") + 1));
        }
        System.out.println(cat_list);

        // get the categories in different languages
        URL category_json;
        for (int i_cat = 0; i_cat < cat_list.size(); i_cat++) {
            // .trim() removes trailing and following whitespaces
            category_json = new URL("https://en.wikipedia.org/w/api.php?action=query&format=json&list=categorymembers&cmtitle=Category%3A"
                    + cat_list.get(i_cat).replaceAll(" ", "%20").trim() + "&cmlimit=500");
            System.out.println(category_json);

            // Opens the connection to the URL so clients can communicate with the resources.
            HttpURLConnection urlConnection = (HttpURLConnection) category_json.openConnection();
            BufferedReader reader = new BufferedReader(new InputStreamReader(category_json.openStream()));
            String line;
            String diff = "";
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
                diff = diff + line;
            }
            urlConnection.disconnect();
            reader.close();

            JSONArray jsonarray_cat = new JSONArray(diff.substring(diff.indexOf("[{\"pageid\"")));
            System.out.println(jsonarray_cat);

            // Loop categories
            // jSONarray is an array of json objects, we are looping through each object
            for (int i_url = 0; i_url < jsonarray_cat.length(); i_url++) {
                // Get the URL _part (Categorie isn't correct)
                int pageid = Integer.parseInt(jsonarray_cat.getJSONObject(i_url).getString("pageid")); // this can be written in a much better way
                System.out.println(pageid);
                String title = jsonarray_cat.getJSONObject(i_url).getString("title");
                System.out.println(title);

                File food_year = new File("C:/Users/User/Desktop/Root/" + cat_list.get(i_cat).replaceAll(" ", "_").trim() + ".txt");
                File food_year2 = new File("C:/Users/User/Desktop/Root/c_" + cat_list.get(i_cat).replaceAll(" ", "_").trim() + ".txt");
                food_year.createNewFile();
                food_year2.createNewFile();

                BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(food_year, true)));
                BufferedWriter writer2 = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(food_year2, true)));

                if (title.contains("Category:")) {
                    writer2.write(pageid + ";" + title);
                    writer2.newLine();
                    writer2.flush();
                    crawlfile(food_year2);
                } else {
                    writer.write(pageid + ";" + title);
                    writer.newLine();
                    writer.flush();
                }
            }
        }
    }
}
For starters, this might be too big a demand on the Wikimedia servers. There are over a million categories, and you should read Wikipedia:Database download - Why not just retrieve data from wikipedia.org at runtime. You would need to throttle your requests to about one per second or risk getting blocked, which means it would take about 11 days to get the full tree.
It would be much better to use the standard dumps at https://dumps.wikimedia.org/enwiki/; these are easier to read and process, and you don't put a big load on the server.
Still better is to get a Wikimedia Labs account, which allows you to run queries on a replica of the database servers, or scripts on the dumps, without having to download some very big files.
To get just the economics categories, it's easiest to go via https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Economics; this has 1242 categories. You may find it easier to use the list of categories there and build the tree from it.
This will be better than a recursive approach. The problem with the Wikipedia category system is that it is not really a tree; there are plenty of loops. I would not be surprised if, by simply following categories, you end up covering most of Wikipedia.
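Given that the category graph contains loops, any recursive walk needs to remember which categories it has already expanded. Below is a rough sketch (my own illustration, not the poster's code): it keeps a visited set, uses the same categorymembers API call as the question, and pauses about one second per request as suggested above. Continuation (cmcontinue) and error handling are left out.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;

import org.json.JSONArray;
import org.json.JSONObject;

public class CategoryWalker {

    private final Set<String> visited = new HashSet<String>(); // categories already expanded

    public void walk(String category) throws Exception {
        if (!visited.add(category)) {
            return; // already seen: this is what breaks the loops in the category graph
        }
        URL url = new URL("https://en.wikipedia.org/w/api.php?action=query&format=json"
                + "&list=categorymembers&cmlimit=500&cmtitle=Category%3A"
                + category.trim().replaceAll(" ", "%20"));
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        StringBuilder body = new StringBuilder();
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8));
        String line;
        while ((line = reader.readLine()) != null) {
            body.append(line);
        }
        reader.close();
        conn.disconnect();

        // Response shape: {"query":{"categorymembers":[{"pageid":..,"title":".."}, ...]}}
        JSONArray members = new JSONObject(body.toString())
                .getJSONObject("query").getJSONArray("categorymembers");
        for (int i = 0; i < members.length(); i++) {
            JSONObject member = members.getJSONObject(i);
            String title = member.getString("title");
            if (title.startsWith("Category:")) {
                walk(title.substring("Category:".length())); // recurse into the subcategory
            } else {
                System.out.println(member.getInt("pageid") + ";" + title); // leaf article
            }
        }
        Thread.sleep(1000); // stay at roughly one request per second, as advised above
    }

    public static void main(String[] args) throws Exception {
        new CategoryWalker().walk("Economics");
    }
}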

'Un'-externalize strings from Eclipse or Intellij

I have a bunch of strings in a properties file which I want to 'un-externalize', i.e. inline into my code.
I see that both Eclipse and IntelliJ have great support for 'externalizing' strings from within code, but does either of them support inlining strings from a properties file back into code?
For example if I have code like -
My.java
System.out.println(myResourceBundle.getString("key"));
My.properties
key=a whole bunch of text
I want my java code to be replaced as -
My.java
System.out.println("a whole bunch of text");
I wrote a simple Java program that you can use to do this.
Deexternalize.java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map.Entry;
import java.util.Properties;
import java.util.Set;
import java.util.Stack;
import java.util.logging.Level;
import java.util.logging.Logger;

public class Deexternalize {

    public static final Logger logger = Logger.getLogger(Deexternalize.class.toString());

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("Deexternalize props_file java_file_to_create");
            return;
        }

        Properties defaultProps = new Properties();
        FileInputStream in = new FileInputStream(args[0]);
        defaultProps.load(in);
        in.close();

        File javaFile = new File(args[1]);
        List<String> data = process(defaultProps, javaFile);
        buildFile(javaFile, data);
    }

    public static List<String> process(Properties propsFile, File javaFile) {
        List<String> data = new ArrayList<String>();
        Set<Entry<Object, Object>> setOfProps = propsFile.entrySet();

        int indexOf = javaFile.getName().indexOf(".");
        String javaClassName = javaFile.getName().substring(0, indexOf);
        data.add("public class " + javaClassName + " {\n");

        StringBuilder sb = null;
        // for some reason it's adding them in reverse order, so putting them on a stack
        Stack<String> aStack = new Stack<String>();
        for (Entry<Object, Object> anEntry : setOfProps) {
            sb = new StringBuilder("\tpublic static final String ");
            sb.append(anEntry.getKey().toString());
            sb.append(" = \"");
            sb.append(anEntry.getValue().toString());
            sb.append("\";\n");
            aStack.push(sb.toString());
        }

        while (!aStack.empty()) {
            data.add(aStack.pop());
        }

        if (sb != null) {
            data.add("}");
        }
        return data;
    }

    public static final void buildFile(File fileToBuild, List<String> lines) {
        BufferedWriter theWriter = null;
        try {
            // Check to make sure the file exists already.
            if (!fileToBuild.exists()) {
                fileToBuild.createNewFile();
            }

            theWriter = new BufferedWriter(new FileWriter(fileToBuild));

            // Write the lines to the file.
            for (String theLine : lines) {
                // DO NOT ADD a Windows carriage return.
                if (theLine.endsWith("\r\n")) {
                    theWriter.write(theLine.substring(0, theLine.length() - 2));
                    theWriter.write("\n");
                } else if (theLine.endsWith("\n")) {
                    // This case is UNIX format already since we checked for
                    // the carriage return already.
                    theWriter.write(theLine);
                } else {
                    theWriter.write(theLine);
                    theWriter.write("\n");
                }
            }
        } catch (IOException ex) {
            logger.log(Level.SEVERE, null, ex);
        } finally {
            try {
                if (theWriter != null) {
                    theWriter.close();
                }
            } catch (IOException ex) {
                logger.log(Level.SEVERE, null, ex);
            }
        }
    }
}
Basically, all you need to do is call this java program with the location of the property file and the name of the java file you want to create that will contain the properties.
For instance this property file:
test.properties
TEST_1=test test test
TEST_2=test 2456
TEST_3=123456
will become:
java_test.java
public class java_test {
public static final String TEST_1 = "test test test";
public static final String TEST_2 = "test 2456";
public static final String TEST_3 = "123456";
}
Hope this is what you need!
EDIT:
I understand what you requested now. You can use my code to do what you want if you sprinkle in a bit of regex magic. Let's say you have the java_test file from above. Copy the inlined properties into the file in which you want to replace the myResourceBundle calls.
For example,
TestFile.java
public class TestFile {
public static final String TEST_1 = "test test test";
public static final String TEST_2 = "test 2456";
public static final String TEST_3 = "123456";
public static void regexTest() {
System.out.println(myResourceBundle.getString("TEST_1"));
System.out.println(myResourceBundle.getString("TEST_1"));
System.out.println(myResourceBundle.getString("TEST_3"));
}
}
OK, now if you are using Eclipse (any modern IDE should be able to do this), go to Edit -> Find/Replace. In the dialog you should see a "Regular expressions" checkbox; check it. Now enter the following text into the Find field:
myResourceBundle\.getString\(\"(.+)\"\)
And the back reference
\1
into the replace.
Now click "Replace all" and voila! The code should have been inlined to your needs.
Now TestFile.java will become:
TestFile.java
public class TestFile {
public static final String TEST_1 = "test test test";
public static final String TEST_2 = "test 2456";
public static final String TEST_3 = "123456";
public static void regexTest() {
System.out.println(TEST_1);
System.out.println(TEST_1);
System.out.println(TEST_3);
}
}
You may use the Eclipse "Externalize Strings" wizard. It can also be used for un-externalization: select the required string(s) and press the "Internalize" button. If the string was externalized before, it will be put back and removed from the messages.properties file.
Maybe if you explain how you need to do this, you could get a more precise answer.
The short answer to your question is no, especially for IntelliJ (I do not know enough about Eclipse). The slightly longer, but still not very useful, answer is to write a plugin that takes a list of property files, reads the keys and values into a map, and then does a regular-expression replace of ResourceBundle.getString("Key") with the value from the map for that key. (I will write this plugin myself if you can convince me that there are more people like you who have this requirement.)
The more elaborate answer is this:
1. First, refactor all the code that performs property-file reading into a single class (or module, called PropertyFileReader).
2. Create a property-file reader module that iterates across all the keys in the property file(s) and stores that information in a map.
3. Either create static map objects with the populated values or generate a constants class from them. Then replace the logic in the property-file reader module with a get on the map or the static class rather than reading the property file.
4. Once I am sure the application behaves correctly (by checking that all my unit tests pass), remove the property files.
Note: if you are using Spring, there is an easy way to split out all property key-value pairs from a list of property files. Let me know if you use Spring.
I would recommend something else: split the externalized strings into localizable and non-localizable properties files. It would probably be easier to move some strings to another file than to move them back into the source code (which will hurt maintainability, by the way).
Of course you can write a simple (to some extent) Perl (or whatever) script which searches for calls to resource bundles and introduces a constant in their place...
In other words, I haven't heard of a de-externalizing mechanism; you need to do it by hand (or write an automated script yourself).
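A rough Java sketch of such a script (my own illustration, assuming a bundle variable named myResourceBundle and simplified escaping; requires Java 11 for the file helpers): it loads the properties file given as the first argument and replaces getString calls with string literals in the .java files given as the remaining arguments.

import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class InlineBundleStrings {

    public static void main(String[] args) throws IOException {
        // args[0] = properties file, args[1..] = .java files to rewrite in place
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            props.load(in);
        }
        Pattern call = Pattern.compile("myResourceBundle\\.getString\\(\"([^\"]+)\"\\)");
        for (int i = 1; i < args.length; i++) {
            Path source = Paths.get(args[i]);
            String code = Files.readString(source, StandardCharsets.UTF_8);
            Matcher m = call.matcher(code);
            StringBuilder out = new StringBuilder();
            while (m.find()) {
                String value = props.getProperty(m.group(1));
                if (value == null) {
                    continue; // leave calls with unknown keys untouched
                }
                // Replace the whole getString(...) call with a quoted literal.
                m.appendReplacement(out, Matcher.quoteReplacement("\"" + value + "\""));
            }
            m.appendTail(out);
            Files.writeString(source, out.toString(), StandardCharsets.UTF_8);
        }
    }
}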
An awesome one-liner from @potong:
sed 's|^\([^=]*\)=\(.*\)|s#Messages.getString("\1")#"\2"#g|;s/\\/\\\\/g' messages.properties |
sed -i -f - *.java
Run this inside your src dir and see the magic.
