Can you retrieve index of all files when connecting to a webserver - java

Is it possible to retrieve an index of all files on the web server when connecting to a website?
(Something similar to this image: http://www.linuxscrew.com/wp-content/uploads/2008/06/directory_index.png)
I understand that you can achieve a similar effect using a web crawler, but there might be some unlisted links on the website that are public but invisible. Is there any way to access those files?

Not unless the server is configured to expose such a list. In many cases, there are very few "files", per se, in the first place. Resources are database records that are processed by server-side routines to provide HTML content to the browser.

Related

How can I connect to Alfresco documents through custom web application

Our Java web application uses Alfresco as its DMS. The application uses a single system user to connect to Alfresco and manages access rights itself with some business logic.
What I'd like to accomplish now is to use the MS Office URI schemes for online editing of Word documents that live in Alfresco, for example a URL that looks like ms-word:ofe|u|https://ourwebapp.com/documents/mydocument.docx
However if we open our documents like this, the user would end up being able to do stuff on Alfresco that we don't want them to do.
Because we want to keep our documents safe and secure, we don't want the users to be able to get the Alfresco documents "directly", but through our app. Opening Alfresco documents directly would mean that each individual user should get a unique Alfresco username/password and we don't have that and we don't want that because we already have lots and lots of documents living in Alfresco.
Surely there are other companies running into this problem? I.e. using their DMS with one single system user?
What I've already tried is to make a REST endpoint. A Spring Filter ensures that an authorisation header with username/password is added and the request is forwarded to Alfresco. Then the response from Alfresco is passed back to the user. However, this results in a document that's opened in read-only mode at best. Furthermore, it doesn't seem very secure to set up a connection with the user using the system user's credentials. For all I know, the user will be able to do things in Alfresco they aren't supposed to do, like editing or even viewing other documents. A little bit like this:
There's very little documentation on how the ms-word protocol exactly works, maybe you can point me in the right direction? Or suggest some workarounds I might try out?
For this to work using the SharePoint protocol (SPP), you would have to reimplement the whole protocol server in front of your application, since you control the access. I know of no free, or even available, SPP implementation that you could reuse for this.
The Alfresco protocol server may not be an option, since you can't or don't want to mirror access control from your app into Alfresco. If you access a system like Alfresco or SharePoint using a file protocol, you will get too many access rights, as you already described. By following the concept of a single application user, you may be locked out of Alfresco's end-user concepts if you can't mirror the access logic into Alfresco.
Years ago we implemented a dynamic low-level access voter to upgrade or downgrade access inside Alfresco's node service, allowing specific permissions based on types and metadata. In the same way, someone could implement an interface to another system to delegate permission checks based on external data, but this would slow down all the systems involved dramatically.
We have a similar requirement, since we access documents and data from several enterprise sources, including Alfresco, from our own business process product. It has a rule- and process-based access concept based on cases and the processes the documents are involved in, not on folders or a document's static ACLs. We use a local service installed on the client that partners with the browser app to download and open documents, and to save them back from a local temporary (checked-out) path after the file is closed. Our local client knows nothing about Alfresco and is authenticated only against our services using JSON Web Tokens.
So my answer is more a concept than a ready-to-go solution, offered in the hope that it is helpful.

how to fetch a JSP page residing on a remote machine?

I'm looking for a way within Spring MVC to put my JSP pages in a remote machine and load them when I need them.
The reason I want to do this is that my application receives page templates from users, and I have to save them somewhere and load them dynamically when that page gets requested. Putting the users' JSP pages inside my web app at runtime isn't possible, so I have two choices:
1) save them in a remote place and get a reference to them when a request comes in
2) save them in a database, which I don't think is good because the user's page may have many visitors ...
What solution do you suggest?
Are you on Unix? You could mount the remote server and make the WEB-INF/jsp directory a symbolic link that points to the remote mount.

Java - File parsing vs fetching html over http

We have a Java class that is supposed to fetch an HTML file and then read some content in it based on the id of certain divs and then return the content to a frontend which will then render it.
Now we have a set of HTML files on a common file system somewhere on the network. Multiple applications will access it. It is like a homegrown GUI help guide for our customer facing screens with a centralized storage.
We have managed to load the html file in 2 ways:
1) Start an Apache web server and put all html files in htdocs. The calling Java class then makes an http call http://someIP:80/helpguide/userguide.html #firstname. This will fetch the help guide related to FirstName field on the screen. The Apache service has to be managed as it is accessed in Live but only accessible within our network.
2) Create a Shared directory and grant access to it to the Windows logon used to run the Windows service that runs Tomcat where the client facing web application is deployed. Then the Java client class uses new File("<file location>") to load the file and read its content. This works as well.
Basically we have 2 ways to load the html file. Now we are confused whether to use route 1 or 2?
The html files won't be that massive and will be of reasonable size. It may have inline css or youtube video links embedded in them.
Downside of (2) is if we want to include images later it won't work while it should work with (1).
However, in terms of performance and efficiency, how are the 2 approaches different? (1) will open an HTTP socket connection over port 80 and get the html stream back. With (2) it will possibly use a FileInputStream to get the file on the server.
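To make the comparison concrete, here is a minimal sketch of the two routes side by side. The class and method names are hypothetical, it assumes Java 11 or later, and it assumes the help files are UTF-8 encoded; it is an illustration of the two code paths, not a measurement of either.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: both methods return the full HTML text of a help page.
final class HelpGuideLoader {

    // Route (2): read the file directly from the shared directory.
    static String loadFromFile(Path file) throws IOException {
        return Files.readString(file, StandardCharsets.UTF_8);
    }

    // Route (1): fetch the same page from the Apache server over HTTP.
    static String loadOverHttp(URI uri) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString(StandardCharsets.UTF_8));
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }
}
```

Either way the caller receives the whole document. A fragment such as #firstname is not sent to the server by HTTP clients, so the lookup of the div by id happens in Java after loading, whichever route is chosen.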

Get the file list in the WebServer from the appserver

Hi all
Is there any method to fetch the file list of the web server from the application server using Java?
I am looking for something like a new File("/webserver_context_root/folder/") call that uses a relative path to get the web server's resources from the app server...
PS : The reverse proxy has been set between the web and application servers.
Any ideas?
The HTTP protocol provides no standard way to list a "folder". Indeed, the HTTP and URI / URL specs don't even recognize "folder" as a concept.
If the folder notion is meaningful for your website then there are two approaches that could work:
1) Many web servers can be configured to produce a listing (e.g. in HTML) for a URL that corresponds to a folder in the web server's content space. (This is usually turned off for security reasons.) You can "scrape" this HTML to extract the list of names of things in the folder.
2) You could implement a RESTful service to return a list of the "files" in a "folder" as (say) JSON or XML.
Note however, that both approaches will be specific to that website. They won't work for arbitrary websites. I'm also assuming that the "application server" accesses the "web server" using HTTP. If it can access it some other way, there may be other solutions.
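As a rough illustration of the scraping approach, the sketch below pulls entry names out of listing HTML that has already been fetched. It assumes Apache-style autoindex markup, where each entry is a relative link and the column-sort links start with "?"; the class and method names are hypothetical, and a real HTML parser would be more robust than a regular expression.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for an Apache-style directory listing page.
final class ListingScraper {

    // Captures the href value of each anchor tag (single or double quoted).
    private static final Pattern HREF = Pattern.compile(
            "<a\\s+[^>]*href\\s*=\\s*[\"']([^\"']+)[\"']", Pattern.CASE_INSENSITIVE);

    static List<String> entryNames(String html) {
        List<String> names = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            String href = m.group(1);
            // Skip sort links, parent-directory links, and absolute links;
            // what remains is assumed to be an entry in the listed folder.
            boolean notAnEntry = href.startsWith("?") || href.startsWith("/")
                    || href.equals("..") || href.startsWith("../") || href.contains("://");
            if (!notAnEntry) {
                names.add(href);
            }
        }
        return names;
    }
}
```

Sub-folders keep their trailing slash (for example "docs/"), which lets the caller tell them apart from files. As the answer notes, this only works against a server whose listing format matches these assumptions.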

web application design problem related to search engine optimization

We have a website with a decent customer base. We recently started a subdomain that will serve specific content from the website. We need to redirect users to the subdomain when they try to access the designated content on the old domain. For example, I have the old domain www.marketplace.com and a subdomain www.paper.marketplace.com. The same web application serves both domains. So when a user tries to access the URL 'www.marketplace.com/paper/viewarticle', they should be redirected to 'www.paper.marketplace.com/paper/viewarticle'. Since the same web application serves both domains, I wanted to do this using a servlet. The servlet should redirect the user to the subdomain based on some configuration. I've thought about putting a properties file in each folder, with a flag that determines whether requests for the .html/.jsp files in it should be redirected.
The .jsp/.html files can be added to or removed from the deployment at runtime, which is also a key reason for choosing this design.
Please comment on this approach, or suggest other ideas if you think they're better.
Thanks.
You can do this redirect using Apache rewrite rules, as long as the redirect follows some sort of convention. This way the overhead on your server of parsing the request, redirecting, and parsing the request again is reduced to just parsing the CORRECT request. That should improve the site's performance by reducing the number of requests the application has to handle.
Rewrites can be done with other servers, not just Apache; Apache is just the best documented. Read the Apache Rewrite Guide for more info.
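The convention itself is the same whether it lives in a rewrite rule or in the servlet the question proposes. A minimal sketch of that mapping as a pure function, assuming the first path segment names the subdomain: the class name, the set of moved sections, and the http scheme are all assumptions made for illustration.

```java
import java.util.Optional;
import java.util.Set;

// Hypothetical redirect rule: /paper/... on the main host moves to www.paper.marketplace.com.
final class SubdomainRedirect {

    // Assumed configuration; the question would load this from properties files.
    private static final Set<String> MOVED_SECTIONS = Set.of("paper");
    private static final String MAIN_HOST = "www.marketplace.com";

    // Returns the redirect URL, or empty if the request should be served as-is.
    static Optional<String> redirectTarget(String host, String path) {
        if (!MAIN_HOST.equalsIgnoreCase(host)) {
            return Optional.empty(); // already on a subdomain (or an unknown host)
        }
        // "/paper/viewarticle" splits into ["", "paper", "viewarticle"].
        String[] segments = path.split("/");
        if (segments.length < 2 || !MOVED_SECTIONS.contains(segments[1])) {
            return Optional.empty();
        }
        return Optional.of("http://www." + segments[1] + ".marketplace.com" + path);
    }
}
```

A servlet filter or a rewrite rule would then only need to issue the redirect when a target is present. Checking the host first avoids a redirect loop once the request arrives on the subdomain.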
