Thursday, March 22, 2007

Creating WebDav Access Point


The Story Begins
Apache Slide WCK
Do It Yourself
WebDav Servlet and Commands
Materialized URI
URI-To-Resource Cache
Handle WebDAV Requests
Construct DAV Response
Debug and Troubleshoot
Lessons Learned
Security
The Story Ends

The Story Begins

Have you ever wanted to let the users of your intranet web-based application work with their documents right from Windows Explorer, so that no one complains about a weird web-based interface that only allows working with one document at a time? Usually you either end up with a portal-based solution (e.g. SharePoint) that provides WebDAV access capabilities out of the box, give up and work with SVN directly :), or go for a custom WebDAV layer implementation. The latter is exactly what I did recently, so let me share my experience with you.

WebDAV is a pretty basic thing (if we don’t consider 2.0 and other extensions like DeltaV). It’s an XML-based protocol that has been out there for about as long as SOAP has. In a nutshell, it’s a standardized way for your document repository (whatever it might be) to let WebDAV-aware clients work with its content (i.e. files and folders) over HTTP and do CRUD plus locks. The WebDAV client, in its turn, knows how to submit requests to the repository and handle the responses. It’s implemented as a number of HTTP 1.1 extensions: new methods (like COPY and LOCK), new response codes (the best known is 207 Multi-Status), and DTD-driven XML request and response bodies.

In the company I work for we have a home-grown project management center, a web-based app that supports project teams throughout the whole project lifecycle. There are plenty of other tools on the market that do the same things, but, you know, if you want a tool to do exactly what you want, the way you want it, you’d better build it yourself. The tool is great, but it was missing one feature that I personally wanted from the first day I knew it: a natural and fast way of dealing with documents. Every development team needs a fresh snapshot of the specs in the morning. I would want to just go there, hit a button, and have all the content on my local machine. As a BA I would want to go there, hit a button, and have my local spec snapshot end up there under version control. Instead you go to the web interface and do Open-CheckOut-Update or Click-Save one by one for every item you are interested in.

Until 2005 the tool didn’t have an open API, so the only option I had was some sort of screen scraping. I missed the feature so much that I was ready to start that HTML-parsing journey. You can imagine it: you take JTidy and various XML tools and hope they never change the layout; or you take HttpUnit, jWebUnit (you name it) and hope they never change the layout. And every time they do change the layout, you find out your great tool stops working, and you go and figure out what exactly has changed and hope you can still find a place to hold on to to keep your thing working. I was lucky, in a way, that I never found enough free time to explore this path :) BUT. The moment I heard they had released a new major version with an EJB and WS API available, I decided it was Time. So I began.

Apache Slide WCK

I googled for open source that would provide something around WebDAV. There are so many open source initiatives, so many great minds contributing to the community, that I figured somebody with a pain in the ass like mine must have already done something about it and released the result to the public domain. I ran into Apache Slide and its WCK (WebDAV Construction Kit), which sounded like an exact match: "The WebDAV Construction Kit (WCK) is a framework for easy integration of the WebDAV interface into all kinds of Java software. No special knowledge of Slide or WebDAV is required to make the usual Windows, Mac and Linux clients work with your server system." WCK has never been released and has only been available in CVS, but I didn’t let that bother me. I downloaded the code, flicked through the available tutorials, and made my own implementation of BasicWebdavStore.

With WCK you only need to implement one thing (doesn’t that sound cool?). You essentially implement a very simple file system; everything WebDAV-specific is hidden. Custom authentication can be plugged in too. It had only been tested and proven to work with Tomcat 5.0.28, but having all the source code, I didn’t think I wouldn’t be able to overcome the issues (if any). By the end of the day I had it working. I thought I was two days away from my first internal release…

It never happened with WCK. After my first steps I felt like I was missing the things that were hidden from me, with no easy way to unhide them and get access. I experienced a weird issue with duplicate requests for the same URL hitting my repository, but with no control over the WebDAV processing and the flow underneath I was unable to figure it out. I realized I couldn’t control the WebDAV properties in the response (the WCK interface requires you to return a list of your repository objects and specify whether each is a folder or a file, but there is no way to go further). At some point I needed to comprehend the way it builds the XML response and was a little shocked by how it does it:
// taken from org.apache.slide.webdav.method.PropFindCommand
resp.getWriter().write("<?xml version=\"1.0\" encoding=\"UTF-8\"?>");
resp.getWriter().write("\n");
buffer.append("<");
buffer.append(multistatusElement.getQualifiedName());
if (namespaceUri != null) {
    buffer.append(" xmlns");
    if (namespacePrefix != null) {
        buffer.append(":");
        buffer.append(namespacePrefix);
    }
    buffer.append("=\"");
    buffer.append(namespaceUri);
    buffer.append("\"");
}
buffer.append(">");
resp.getWriter().write(buffer.toString());
resp.getWriter().write("\n");


The only chance to comprehend the response was an HTTP sniffer tool. By the way, try out HTTPLook, it’s good.

Anyway. This post is not about WCK so I won’t go into more detail. Maybe it was not mature enough by then (I am not sure it’s even alive now), maybe I was missing some very important concept and didn’t have enough patience to figure it out… who knows. I threw my work out and moved on to the next round. I decided to Do It Myself.

Do It Yourself

Do-It-Yourself is always fun. Technical people always enjoy inventing things :) Let me tell you what it takes to build your custom WebDAV connector and how you would do it.

Assuming your document repository has some kind of API to integrate with, you have to implement a WebDAV HTTP request processor and stuff it with repository-specific logic that handles particular DAV commands and orchestrates calls to your repository API according to what you mean to do when each particular command comes in.

WebDAV Servlet and Commands

Simple picture below illustrates the idea:




1. You create a new DocumentsWebDavServlet and map it in your web.xml to /* URLs, assuming the web app we are building won’t handle anything but DAV requests to your repository.

2. You define a basic WebdavCommand interface with two methods:
public void init(HttpServletRequest request, HttpServletResponse response);

public void process(HttpServletRequest request, HttpServletResponse response);

3. You implement all the WebDAV commands you want to support and make your servlet pick the right command implementation based on the HTTP method requested. This may happen in the service() method, with something like the line below:
WebdavCommandsFactory.createCommandInstance(request.getMethod())


I suggest you implement a CommandNotSupported command that always responds with 405 and return it by default when no command implementation matches the requested HTTP method.

4. You execute the command instance and let it manipulate your repository and write the response back to the client.
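The factory from steps 3 and 4 can be sketched as below. This is a minimal, hypothetical sketch using the names from this post (WebdavCommandsFactory, CommandNotSupported); commands are reduced to returning a status code so the example stays self-contained, where the real thing would take the request and response objects:

```java
import java.util.HashMap;
import java.util.Map;

// Commands reduced to a status code for illustration; the real interface
// would expose init(request, response) and process(request, response).
interface WebdavCommand {
    int process();
}

class CommandNotSupported implements WebdavCommand {
    // 405 Method Not Allowed for anything we don't implement
    public int process() { return 405; }
}

class CommandPROPFIND implements WebdavCommand {
    public int process() { return 207; } // 207 Multi-Status
}

class WebdavCommandsFactory {
    private static final Map<String, WebdavCommand> COMMANDS =
        new HashMap<String, WebdavCommand>();

    static {
        COMMANDS.put("PROPFIND", new CommandPROPFIND());
        // register GET, PUT, MKCOL, LOCK, etc. the same way
    }

    // pick the command by HTTP method, falling back to 405 by default
    static WebdavCommand createCommandInstance(String httpMethod) {
        final WebdavCommand command = COMMANDS.get(httpMethod);
        return (command != null) ? command : new CommandNotSupported();
    }
}
```

The nice property of the default fallback is that the servlet’s service() method never has to special-case unknown methods.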

So far everything seems straightforward. The first issue you will face is that the only thing to hold on to when processing a particular request is the request URI.

Materialized URI

Every WebDAV resource is supposed to be identified by a unique URL. All an incoming request tells you is the resource URL, the command to perform on it, and a number of headers with additional info (e.g. the Depth header of a PROPFIND request, which controls how deep your application shall explore the resource tree starting from the given collection URL). Given the URL, you may want to ask it what resource it points to in Your Repository Language; whether it’s a folder (what’s called a "collection" in WebDAV glossary) or a file; etc. And the only thing to hold on to is the string URI.

I came up with the concept of a Materialized URI: an object representation of the request URI string with lots of useful information mined out of it. Take a look at the class definition and a small code snippet below (description follows):





public class MaterializedUri implements Serializable {
    ...
    public boolean init() {
        boolean isCacheOutOfSynch = false;

        // a buffer to re-build the URI as it's being tokenized
        final StringBuffer uriPath = new StringBuffer(requestUrl.length());
        final StringTokenizer uriTokenizer =
            new StringTokenizer(requestUrl, "/", false);

        try {
            // 1. build root and add application context path
            uriPath.append(UriUtils.URI_SEPARATOR);
            uriPath.append(uriTokenizer.nextToken());

            // 2. identify current repository instance
            final String repositoryInstanceName = uriTokenizer.nextToken();
            uriPath.append(UriUtils.URI_SEPARATOR);
            uriPath.append(repositoryInstanceName);

            if (LOGGER.isDebugEnabled()) {
                ...
            }
            ...
            // 3. identify context project
            final String ctxProjectName = uriTokenizer.nextToken();
            uriPath.append(UriUtils.URI_SEPARATOR);
            uriPath.append(ctxProjectName);

            if (LOGGER.isDebugEnabled()) {
                ...
            }

            // assume the project has already been read as a child of its
            // parent and therefore is supposed to be cached already
            final RepositoryResource ctxProject =
                UriToResourceCache.getInstance().getCachedResource(uriPath.toString());

            if (null == ctxProject) {
                // see UriToResourceCache.forceCacheRead() for more details
                isCacheOutOfSynch = true;

            } else {
                this.contextProject = ctxProject;
                ...
            }
            ...

        } catch (NoSuchElementException e) {
            // tokenizing stops when no more tokens are found
        } finally {
            // whatever the current resource is, it is expected to have been
            // read before as a child of its parent element.
            // This is not true if the WebDAV client used cached data instead
            // of re-querying content. UriToResourceCache.forceCacheRead() is
            // designed to force a cache update for such cases.

            this.associatedResource =
                UriToResourceCache.getInstance().getCachedResource(requestUrl);
            ...
        }

        isInitialized = true;
        return isCacheOutOfSynch;
    }


The URL can tell us what repository instance we are connected to, what project we’re browsing, whether it’s a root URL or not, etc. As you can see, we know what to expect from the URI; it’s a white box, if you will: the repository instance code comes first, the project code comes next, etc. The reason we can do this is that it’s us, our application, who builds the URLs (!) that we later receive in requests. We’ll explore the PROPFIND request semantics in a while, but you can imagine the client asking for “/”, your system responding with a list of child-node resources with their URLs, and a subsequent request coming in for a particular element from the list you just responded with.

When materializing the URI we consult a UriToResourceCache to find an object that matches the URI part.

URI-To-Resource Cache

UriToResourceCache is another key concept that follows from the materialized URI. The idea is simple: you don’t want your repository queried over and over again to “materialize” every incoming URI. Think about the way you work with a folder structure in Explorer… you always read the parent element first. You can’t go directly to a file on the fifth nesting level unless you visited the fourth level and got the file listing. Bingo! It means you always read the structure first, and you can memorize it, so that when “/folder/subfolder” is requested you can safely assume that the “/folder” part has been read before and is already materialized.

What if not? What if you get null back from your cache for /folder when trying to process /folder/subfolder? Your user might have gone directly to the subfolder URL, a perfectly valid scenario. In that case you need to force your tool to re-read the repository for this particular URL and make sure all parent items are memorized in the cache. And if you are still getting null after the force-read exercise completes, you have to report an exception: either the URL requested is not correct, or something changed in your repository between the user’s requests. Look at the code snippets below to comprehend the idea:




/**
 * Returns cached RepositoryResource object by request uri.
 *
 * @param uri request uri
 *
 * @return cached uri object or <code>null</code> if not found
 */
public RepositoryResource getCachedResource(final String uri) {
    final RepositoryResource item =
        (RepositoryResource) uriMappingsCache.get(uri);
    return item;
}

...

/**
 * Caches mapping between request uri and Repository resource.
 *
 * @param uri request uri
 * @param resource Repository resource to be cached
 */
public void registerMapping(final String uri,
                            final RepositoryResource resource) {
    if (null != resource) {
        resource.setUri(uri);
        uriMappingsCache.put(uri, resource);
    }
}

...

/**
 * Ensures URIs cache is up to date.
 *
 * Every Repository resource, when read, is cached and associated with its URI.
 * The system assumes that other requests such as GET, HEAD, OPTIONS, etc.
 * always follow a PROPFIND. This assumption is only correct when the client
 * doesn't cache content it has read and the system never goes down.
 * When the system has been restarted, or when a client caches the content it
 * has read, the system may receive a request for a URL that was not read
 * before by a PROPFIND command.
 *
 * This method will recursively check the URI path and pre-load the appropriate
 * elements into the cache, emulating PROPFIND requests.
 *
 * @param uriToRead URI being processed
 *
 * @throws ServiceException if reported by Repository API
 * @throws ...
 */
public void forceCacheRead(final String uriToRead)
        throws ServiceException, UrlDoesNotExistException {
    LOGGER.info("Force cache read for [" + uriToRead + "]");

    final StringBuffer uriBuffer = new StringBuffer();
    if (!isEmpty()) {
        // Traverse the uri from the end to the beginning to identify the
        // non-existing part only
        uriBuffer.append(uriToRead);

        RepositoryResource associatedItem = null;
        do {
            associatedItem =
                UriToResourceCache.getInstance()
                                  .getCachedResource(uriBuffer.toString());

            // need to delete the last part of the URL;
            // it will be re-added on the first round of tokenizing the
            // missing part
            uriBuffer.delete(uriBuffer.lastIndexOf(UriUtils.URI_SEPARATOR),
                             uriBuffer.length());

        } while ((null == associatedItem) && (uriBuffer.length() > 0));
    }

    // now uriBuffer contains the existing part, or is empty if nothing was
    // ever cached. We will gradually add to the existing part what's missing
    // and read it step by step. uriToTokenize is "what's missing"
    final String uriToTokenize =
        (uriBuffer.length() > 0) ? uriToRead.substring(uriBuffer.length())
                                 : uriToRead;

    final StringTokenizer tokenizer =
        new StringTokenizer(uriToTokenize, UriUtils.URI_SEPARATOR);

    if (uriBuffer.length() == 0) {
        // add the very root and skip it as never cached
        uriBuffer.append(UriUtils.URI_SEPARATOR);
        uriBuffer.append(tokenizer.nextToken());
    }

    while (tokenizer.hasMoreTokens()) {
        uriBuffer.append(UriUtils.URI_SEPARATOR);
        uriBuffer.append(tokenizer.nextToken());

        final MaterializedUri uriPart =
            new MaterializedUri(uriBuffer.toString());
        uriPart.init();

        // the resource will be read and cached
        final DocumentsReader reader = new DocumentsReader();
        final List result = reader.readUriContent(uriPart);

        LOGGER.info("...");
    }
}


The "resource" part in “URI-To-Resource” is just a lightweight POJO (a number of them, actually, all implementing the RepositoryResource interface) that I created to represent my repository’s objects.

Handle WebDAV Requests

With the concepts of the materialized URI and the URI-To-Resource cache, the first thing we do with a WebDAV request when it comes in is materialize the URI and store it into the request context. A ThreadLocal works fine as long as you are not going to involve any kind of out-of-process calls that would require you to pass it to method calls explicitly.
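A ThreadLocal-based request context could be sketched as below. RequestContext is a hypothetical name, and MaterializedUri is a trimmed stand-in for the class shown earlier, just to keep the sketch self-contained:

```java
// Trimmed stand-in for the MaterializedUri class shown earlier.
class MaterializedUri {
    final String requestUrl;
    MaterializedUri(String requestUrl) { this.requestUrl = requestUrl; }
}

// Per-thread holder for the materialized URI of the request being served.
final class RequestContext {
    private static final ThreadLocal<MaterializedUri> CURRENT_URI =
        new ThreadLocal<MaterializedUri>();

    private RequestContext() { }

    public static void setUri(MaterializedUri uri) {
        CURRENT_URI.set(uri);
    }

    public static MaterializedUri getUri() {
        return CURRENT_URI.get();
    }

    // call from a finally block in the servlet so state never leaks
    // between requests served by the same pooled worker thread
    public static void clear() {
        CURRENT_URI.remove();
    }
}
```

The clear() call matters because servlet containers reuse worker threads; forgetting it would let one request see the previous request’s URI.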

Usually it’s enough to know the requested DAV method, the URL, and the headers. This information is available in the HttpServletRequest. Sometimes you may also need the XML request body to figure out the requested attributes (for the PROPFIND method), check a lock token (for LOCK/UNLOCK), etc. I’m not sure there’s a need to invent anything here, so I just parsed it into a DOM when needed. Like this:
final Document requestBody = RequestUtils.parseRequestBody(request);


I also dump it to the log files using an identity transformation. You may want to consider SAX parsing or some XML-to-Java binding technique instead. That’s pretty much it. The rest is to do what the requested command mandates and send the response back to the client.
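RequestUtils.parseRequestBody itself is not shown in this post; a minimal version could look like the sketch below. To stay container-independent it takes the body stream and Content-Length instead of an HttpServletRequest, which is an assumption on my part:

```java
import java.io.InputStream;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;

final class RequestUtils {
    // Parse the request body into a namespace-aware DOM;
    // return null when there is no body to parse.
    static Document parseRequestBody(InputStream body, int contentLength)
            throws Exception {
        if (contentLength <= 0) {
            // e.g. a PROPFIND with Content-Length: 0 carries no XML body
            return null;
        }
        final DocumentBuilderFactory factory =
            DocumentBuilderFactory.newInstance();
        // DAV: elements are namespace-qualified
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(body);
    }
}
```

Feeding it the sample PROPFIND body shown later gives a DOM whose document element is propfind in the DAV: namespace.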

Basically you would implement the following commands to begin with:

  • HEAD. Just report 200 if the resource identified by the URL exists or report 404 otherwise.

  • OPTIONS. Report 200 for a correct URI and provide a number of WebDav headers that indicate to the caller which WebDav features your tool supports:


    response.setHeader("DAV", "1,2");
    response.setHeader("MS-Author-Via", "DAV");
    response.setHeader("Allow","OPTIONS,GET,... ");


  • PROPFIND. The most verbose command as you would need to send the 207 status with the directory listing for the given URL. We will explore this command in a short while.

  • GET. Respond with 200 for file-type resources and write the binary content into the output stream, or respond with 500 or 405 for folders. Explorer should never request GET for folder-type resources.

  • PUT. Expect binary content to come in for the URL, to either create a new file resource or update an existing one. The opposite of GET. Perform the update and respond with 200.

  • MOVE. You will need it; I will tell you why a little bit later. You can limit your implementation to a move without overwrite, and to an existing parent folder only. Use your repository API to rename your folders and files and change their parent location if needed.

  • DELETE. I implemented it for empty folders only; I didn’t want the users to be able to delete recursively. If they really want to delete that way they can go to the web interface and do it there. You can’t build any extra confirmation messages or custom dialogs for this kind of application unless you build a custom Explorer plug-in (e.g. TortoiseSVN).

  • MKCOL. This is a “create folder” request. PUT for folders.

  • LOCK / UNLOCK. If your repository supports version control you’d better implement these methods. MS Office tools always send a LOCK command to your repository before sending GET. I would also recommend you LOCK implicitly for update operations and make sure you force the version control feature (if one is available).
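The status-code decisions for HEAD and GET above can be sketched as follows. StatusRules and TYPE_FILE are hypothetical names; the type code 3 = file matches the convention used in the XSLT later in this post:

```java
final class StatusRules {
    static final int TYPE_FILE = 3;

    // HEAD: 200 if the URL materialized into a resource, 404 otherwise
    static int headStatus(Object associatedResource) {
        return (associatedResource != null) ? 200 : 404;
    }

    // GET: 200 for files only; Explorer should never GET a folder,
    // so answer 405 for collection resources
    static int getStatus(Object associatedResource, int resourceType) {
        if (associatedResource == null) {
            return 404;
        }
        return (resourceType == TYPE_FILE) ? 200 : 405;
    }
}
```

In the real commands these decisions would of course be followed by writing the body (for GET) or an empty response (for HEAD).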


That’s it. Once you have these commands implemented you can do basic manipulations with your repository. You can even do bulk copies, bulk reads, and bulk updates :) Explorer will do a PROPFIND to explore the tree structure and send individual GETs, PUTs, and MKCOLs for every element in your “bulk” as required.

You can also consult the WebDAV reference if you want to know what response code you shall send back to the client in which circumstances (http://www.webdav.org/specs/rfc2518.html).

Construct DAV Response

Let’s explore the PROPFIND command as the most verbose one, meaning it requires the most effort to actually respond to. Other commands assume either a status-code response, binary content (GET), or just a very basic XML body. PROPFIND is different. First of all, the request can be as simple as just a URL and depth 0:


PROPFIND /easydoc HTTP/1.1
Content-Language: en-us
Accept-Language: en-us, ru;q=0.5
Content-Type: text/xml
Translate: f
Depth: 0
Content-Length: 0
User-Agent: Microsoft Data Access Internet Publishing Provider DAV
Host: localhost:8080
Connection: Keep-Alive


Or it can be much more specific:


PROPFIND /easydoc HTTP/1.1
Content-Language: en-us
Accept-Language: en-us, ru;q=0.5
Content-Type: text/xml
Translate: f
Depth: 1
Content-Length: 489
User-Agent: Microsoft Data Access Internet Publishing Provider DAV
Host: localhost:8080
Connection: Keep-Alive
Cookie: JSESSIONID=2263FAB024BA429E24F8DB9C0CDF1F7D
Authorization: Basic UGF2ZWxfVmVsbGVyOjxFcGFtMjAwNi8+

<?xml version="1.0" encoding="UTF-8" ?>
<a:propfind xmlns:a="DAV:" xmlns:b="urn:schemas-microsoft-com:datatypes">
<a:prop>
<a:name/>
<a:parentname/>
<a:href/>
<a:ishidden/>
<a:isreadonly/>
<a:getcontenttype/>
<a:contentclass/>
<a:getcontentlanguage/>
<a:creationdate/>
<a:lastaccessed/>
<a:getlastmodified/>
<a:getcontentlength/>
<a:iscollection/>
<a:isstructureddocument/>
<a:defaultdocument/>
<a:displayname/>
<a:isroot/>
<a:resourcetype/>
</a:prop>
</a:propfind>


Even though it looks complex, it basically tells you either to list all properties of the element identified by the URL (depth set to 0) or to also add a listing of all elements underneath (depth set to 1 or infinity). I would suggest you don’t implement infinity. My implementation, for example, would throw CommandAbortedException and report 405 (not supported) back to the user in case infinity depth is requested. I never saw Explorer asking for infinity depth. And I don’t want my tool getting stuck reading the repository recursively down to all leaf elements from the current one :)

The second part is to filter out the attributes requested by the client and put together an appropriate multi-status (207 response code) response. And this is where I would suggest you:

  • Template your PROPFIND response as an XML document

  • Query the list of repository objects with their properties as a Collection of POJOs

  • Convert the collection of objects to some XML representation. I used XStream (http://xstream.codehaus.org/) because it’s about as simple to use as a Java-to-XML conversion tool could be. The only thing to note about XStream is that it “optimizes” the resulting XML so as not to repeat the same object twice. So be ready to design for idrefs

  • Adjust your PROPFIND response template to become an XSLT stylesheet that consumes two XML documents to build the actual response. The two documents you need are the original request body (to filter out properties) and the resulting list of objects to be returned to the client



Here is what your CommandPROPFIND may look like:


/**
 * Process PROPFIND request.
 *
 * @param request request being processed
 * @param response response to be created
 *
 * @throws ServletException if caused by request/response operations
 * @throws IOException if caused by request/response operations
 * @throws ServiceException if reported by Repository API
 * @throws CommandAbortedException if infinity depth is requested
 */
public void process(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException,
               ServiceException, CommandAbortedException {
    String depth = request.getHeader("Depth");
    if (null == depth) {
        depth = DEPTH_INFINITY;
    }

    // WebDav resources on the current level
    final List listing = new ArrayList();

    // resource associated with the url being processed
    listing.add(uri.getRepositoryResource());

    if (DEPTH_0.equals(depth)) {
        // no action for depth 0;
        // the current resource has already been added to the response stack

    } else if (DEPTH_1.equals(depth)) {
        // read the content of this URL
        final DocumentsReader reader = new DocumentsReader();
        try {
            listing.addAll(reader.readUriContent(uri));
        } catch (UrlDoesNotExistException e) {
            LOGGER.error(e.getMessage());
            throw new CommandAbortedException(RESPONSE_CODE_NOT_FOUND,
                                              e.getMessage());
        }

    } else if (DEPTH_INFINITY.equals(depth)) {
        throw new CommandAbortedException(RESPONSE_CODE_NOT_SUPPORTED,
                                          "Infinity depth not supported");
    }

    // response container
    final Response propfindResponse = new Response(listing, uri);

    // serialize the response so that it can be converted into a WebDav response
    final XStream serializer = new XStream();
    serializer.alias("response", Response.class);
    serializer.alias("uri", MaterializedUri.class);
    serializer.alias("resource", Instance.class);
    serializer.alias("resource", BasicRepositoryResource.class);

    final String listingSerialized = serializer.toXML(propfindResponse);

    // dump the object response for debug purposes
    if (LOGGER.isDebugEnabled()) {
        ...
    }

    final DocumentBuilderFactory documentBuilderFactory =
        DocumentBuilderFactory.newInstance();
    documentBuilderFactory.setNamespaceAware(true);

    final DocumentBuilder documentBuilder;
    try {
        documentBuilder = documentBuilderFactory.newDocumentBuilder();
    } catch (ParserConfigurationException e) {
        LOGGER.fatal(e);
        throw new ServletException("JAXP initialization failed");
    }

    // lock in to Saxon as it's the only one that provides XSLT 2.0 support
    final TransformerFactory xsltFactory = TransformerFactoryImpl.newInstance();

    // parse the object response for further transformation
    final Document listingDom;
    try {
        listingDom = documentBuilder.parse(
            new InputSource(new StringReader(listingSerialized)));

    } catch (SAXException e) {
        LOGGER.error("Unable to load serialized listing into XML DOM.", e);
        throw new ServletException("Unable to load serialized listing");
    }

    // create the PROPFIND response
    String responseContent = "";
    InputStream responseXSLT = null;
    try {
        responseXSLT =
            CommandPROPFIND.class.getResourceAsStream("propfind-response.xslt");

        final Transformer responseTransformer =
            xsltFactory.newTransformer(new StreamSource(responseXSLT));

        // WebDav request body
        final Document requestBody = RequestUtils.parseRequestBody(request);

        // let the XSLT work with the original WebDav request
        responseTransformer.setParameter("dav-request",
                                         (null != requestBody)
                                             ? requestBody
                                             : documentBuilder.newDocument());

        // convert the object response to a WebDav response (see XSLT for details)
        final StringWriter responseBuffer = new StringWriter();
        responseTransformer.transform(new DOMSource(listingDom),
                                      new StreamResult(responseBuffer));
        responseContent = responseBuffer.toString();

    } catch (TransformerException e) {
        LOGGER.error("Unable to create PROPFIND response", e);
        throw new ServletException("Unable to create PROPFIND response");
    } finally {
        if (responseXSLT != null) {
            responseXSLT.close();
        }
    }

    // respond to the client
    response.setStatus(RESPONSE_CODE_MULTISTATUS);
    response.setContentType("text/xml; charset=UTF-8");
    response.setContentLength(responseContent.getBytes("utf-8").length);
    response.getWriter().print(responseContent);
    response.flushBuffer();
}


Here’s the request XML:


<?xml version="1.0" encoding="UTF-8"?>
<a:propfind xmlns:a="DAV:" xmlns:b="urn:schemas-microsoft-com:datatypes">
    <a:prop>
        <a:name/>
        <a:parentname/>
        <a:href/>
        <a:ishidden/>
        <a:isreadonly/>
        <a:getcontenttype/>
        <a:contentclass/>
        <a:getcontentlanguage/>
        <a:creationdate/>
        <a:lastaccessed/>
        <a:getlastmodified/>
        <a:getcontentlength/>
        <a:iscollection/>
        <a:isstructureddocument/>
        <a:defaultdocument/>
        <a:displayname/>
        <a:isroot/>
        <a:resourcetype/>
    </a:prop>
</a:propfind>


An example of the serialized response XML mentioned above:


<?xml version="1.0"?>
<response>
    <uri>
        <requestUrl>/easydoc</requestUrl>
        <isInitialized>true</isInitialized>
    </uri>
    <listing>
        <resource>
            <name>instance1</name>
            <rootServiceUrl>t3://instance1.mycompany.com:7001</rootServiceUrl>
            <type>0</type>
            <uri>/easydoc/instance1</uri>
            <properties/>
        </resource>
        <resource>
            <name>instance2</name>
            <rootServiceUrl>t3://instance2.mycompany.com:7010</rootServiceUrl>
            <type>0</type>
            <uri>/easydoc/instance2</uri>
            <properties/>
        </resource>
    </listing>
</response>


And finally the XSLT to wrap it all together:


<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:D="DAV:"
                xmlns:saxon="http://saxon.sf.net/"
                exclude-result-prefixes="saxon">

    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

    <!-- WebDav request body -->
    <xsl:param name="dav-request"/>

    <xsl:template match="/">
        <D:multistatus>
            <!-- generate a separate response
                 for every URL returned by the back-end process -->
            <xsl:apply-templates select="/response/listing/element()"/>
        </D:multistatus>
    </xsl:template>

    <!-- the very root URI when no repository instance is selected -->
    <xsl:template match="/response/listing/null">
        <D:response>
            <D:href>
                <xsl:value-of select="/response/uri/requestUrl"/>
            </D:href>
            <D:propstat>
                <D:prop>
                    <!-- it's a collection of repository instances -->
                    <D:resourcetype><D:collection/></D:resourcetype>
                </D:prop>
                <D:status>HTTP/1.1 200 OK</D:status>
            </D:propstat>
        </D:response>
    </xsl:template>

    <!-- repository resources that were serialized by reference (XStream) -->
    <xsl:template match="resource[@reference]">
        <!-- using a saxon extension to evaluate dynamic XPath expressions -->
        <xsl:apply-templates select="saxon:evaluate(@reference)"/>
    </xsl:template>

    <!-- repository resource URI -->
    <xsl:template match="instance | project[@class = 'resource'] |
                         module[@class = 'resource'] |
                         repositoryResource[@class = 'resource'] |
                         resource">
        <D:response>
            <D:href>
                <xsl:value-of select="/response/uri/requestUrl"/>
                <xsl:value-of select="if (position() &gt; 1)
                                      then concat('/', ./name)
                                      else '/'"/>
            </D:href>
            <xsl:choose>
                <!-- PROPFIND request for all properties with their values -->
                <xsl:when test="$dav-request/D:propfind/D:allprop">
                    <xsl:apply-templates select="./properties"/>
                </xsl:when>
                <!-- PROPFIND request for names (no values)
                     of all available properties of the resource -->
                <xsl:when test="$dav-request/D:propfind/D:propname">
                    <xsl:apply-templates select="./properties">
                        <xsl:with-param name="with-values" select="false()"/>
                    </xsl:apply-templates>
                </xsl:when>
                <!-- PROPFIND request for specific properties -->
                <xsl:otherwise>
                    <!-- set of existing properties -->
                    <xsl:variable name="prop-names"
                                  select="./properties/entry/element()[1]/text()"/>
                    <!-- set of requested properties -->
                    <xsl:variable name="req-prop-names"
                                  select="$dav-request/D:propfind/D:prop/element()"/>

                    <xsl:apply-templates select="./properties">
                        <!-- create a filter not to include properties
                             not requested by the client -->
                        <xsl:with-param
                            name="filter"
                            select="$prop-names[. = $req-prop-names/local-name()]"/>
                    </xsl:apply-templates>

                    <xsl:variable name="not-existing-props"
                                  select="$req-prop-names[local-name(.) != $prop-names]
                                          [not (local-name(.) = $prop-names)]"/>
                    <xsl:if test="count($not-existing-props) &gt; 0">
                        <D:propstat>
                            <D:prop>
                                <xsl:copy-of select="$not-existing-props"/>
                            </D:prop>
                            <D:status>HTTP/1.1 404 Not Found</D:status>
                        </D:propstat>
                    </xsl:if>

                </xsl:otherwise>
            </xsl:choose>
        </D:response>
    </xsl:template>


    <!-- provide object properties -->
    <xsl:template match="properties">
        <xsl:param name="with-values" select="true()"/>
        <xsl:param name="filter"/>
        <D:propstat>
            <D:prop>
                <xsl:for-each select="./entry">
                    <!-- check if it needs to be filtered out -->
                    <xsl:if test="not($filter) or
                                  (current()/element()[1]/text() = $filter)">
                        <!-- force the namespace prefix as otherwise Explorer
                             doesn't understand the response -->
                        <xsl:element name="D:{./element()[1]/text()}"
                                     namespace="DAV:">
                            <!-- check if the property value needs to be provided -->
                            <xsl:if test="$with-values">
                                <xsl:value-of select="./element()[2]/text()"/>
                            </xsl:if>
                        </xsl:element>
                    </xsl:if>
                </xsl:for-each>
                <!-- all resource types except 3 (file)
                     are assumed to be collection resources -->
                <D:resourcetype>
                    <xsl:if test="parent::element()/type != 3">
                        <D:collection/>
                    </xsl:if>
                </D:resourcetype>
            </D:prop>
            <D:status>HTTP/1.1 200 OK</D:status>
        </D:propstat>
    </xsl:template>
</xsl:stylesheet>


The resulting response looks like:

HTTP/1.1 207 Multi-Status
Pragma: No-cache
Cache-Control: no-cache
Expires: Thu, 01 Jan 1970 03:00:00 EET
Content-Type: text/xml;charset=UTF-8
Content-Length: 931
Date: Thu, 22 Mar 2007 15:14:14 GMT
Server: Apache-Coyote/1.1


<?xml version="1.0" encoding="UTF-8"?>
<D:multistatus xmlns:D="DAV:">
  <D:response>
    <D:href>/easydoc</D:href>
    <D:propstat>
      <D:prop>
        <D:resourcetype>
          <D:collection/>
        </D:resourcetype>
      </D:prop>
      <D:status>HTTP/1.1 200 OK</D:status>
    </D:propstat>
  </D:response>
  <D:response>
    <D:href>/easydoc/instance1</D:href>
    <D:propstat>
      <D:prop>
        <D:resourcetype>
          <D:collection/>
        </D:resourcetype>
      </D:prop>
      <D:status>HTTP/1.1 200 OK</D:status>
    </D:propstat>
  </D:response>
  <D:response>
    <D:href>/easydoc/instance2</D:href>
    <D:propstat>
      <D:prop>
        <D:resourcetype>
          <D:collection/>
        </D:resourcetype>
      </D:prop>
      <D:status>HTTP/1.1 200 OK</D:status>
    </D:propstat>
  </D:response>
</D:multistatus>


Debug and Troubleshoot

To debug you will need an HTTP sniffer tool, and you will need it a lot. It's the only way to see what exactly is happening behind the scenes between your application and the Explorer client. Oh, by the way, in order to open your repository you can either set up a new Network Place or just do File -> Open in Internet Explorer and check "Open as Web Folder".
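For reference, a Depth: 1 PROPFIND against the repository root has roughly this shape on the wire — this is a generic RFC 2518-style example, not an exact capture of what Explorer sends:

```
PROPFIND /easydoc/ HTTP/1.1
Host: yourserver
Depth: 1
Content-Type: text/xml

<?xml version="1.0" encoding="utf-8"?>
<D:propfind xmlns:D="DAV:">
  <D:prop>
    <D:resourcetype/>
    <D:getcontentlength/>
  </D:prop>
</D:propfind>
```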

When something doesn't work and you can't understand why, I suggest looking at how SVN responds to similar Explorer queries and comparing that with what your application sends back to the client. For instance, try to guess how Explorer expects your mixed files and folders to be organized in the response to a PROPFIND command: does the order matter, and if so, what has to come first, and what happens if you don't do it the way Explorer expects…

Lessons Learned

Your files (leaf resources) must come first. Folders (collection resources in WebDAV terms) must follow. If you don't do it this way, Explorer shows nothing and alerts with a cryptic message. And the bad thing is, Explorer doesn't tell you what exactly was wrong with your response. It also tells you nothing about the status messages your server responds with in case of errors. Maybe I missed something, but it looks like the only way to get more out of it is to go for a custom plug-in, like the SVN guys did with Tortoise. The same goes for CHECKOUT and other version control features: no such commands are available in the Explorer context menu.
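Enforcing that ordering is easy once you sort your resource model before rendering the multistatus. A sketch — the Resource class here is just a stand-in for whatever your repository actually uses:

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class ResponseOrder {

    // Minimal stand-in for a repository resource.
    public static class Resource {
        public final String name;
        public final boolean collection;
        public Resource(String name, boolean collection) {
            this.name = name;
            this.collection = collection;
        }
    }

    // Explorer shows nothing unless leaf resources (files) precede
    // collections (folders) in the PROPFIND response, so sort them
    // that way before generating the D:response elements.
    public static void orderForExplorer(List<Resource> resources) {
        Collections.sort(resources, new Comparator<Resource>() {
            public int compare(Resource a, Resource b) {
                if (a.collection == b.collection) {
                    return a.name.compareTo(b.name);
                }
                return a.collection ? 1 : -1; // files first
            }
        });
    }
}
```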

I was impressed when I learned how Explorer creates new folders. I thought the "New Folder" was temporarily created in memory and flushed to disk only after you provide the actual name and hit Enter. Not the case with WebDAV (and I assume not the case with a regular file system either). With WebDAV, Explorer performs the following sequence: HEAD (check whether the name exists), MKCOL (create a folder named "New Folder"), and finally MOVE (rename to the name you actually typed) or DELETE (if you cancel the folder creation).
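On the server side, the interesting part of that sequence is MOVE: the rename target arrives in the Destination header, so getting the new name boils down to extracting the last path segment. A sketch (the helper name is my own invention):

```java
import java.net.URI;

public class MoveHelper {

    // A MOVE request carries its target in the Destination header,
    // e.g. Destination: http://host/easydoc/instance1/Specs
    // The new display name is the last path segment; URI.getPath()
    // also takes care of percent-decoding for us.
    public static String newNameFromDestination(String destinationHeader) {
        String path = URI.create(destinationHeader).getPath();
        if (path.endsWith("/")) {
            path = path.substring(0, path.length() - 1);
        }
        return path.substring(path.lastIndexOf('/') + 1);
    }
}
```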

There's one limitation of the approach in general, related to the nature of WebDAV: every resource is supposed to be identified by a unique URL. Imagine you have two files (or whatever entities your repository represents as WebDAV resources) with the same name. They apparently have different IDs somewhere in your system but share the same display name. Your PROPFIND response can contain such "duplicates", but you won't be able to figure out which one the user double-clicked, because the only thing you receive back from the client is the URL. I wish I could add some custom property to a WebDAV resource that Explorer would memorize and send back with the next request. As a way out you could add the ID to the name, but then your user would see it. It was not an issue for my tool and the repository I connected to, but it's good to know and keep in mind anyway.
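If you do decide to expose the ID in the name, the mapping could look like this — a sketch with invented names, and the suffix format is arbitrary:

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class NameMapper {

    // Maps internal IDs to display names, appending the ID only when
    // two resources would otherwise collide on the same URL segment.
    public static Map<String, String> uniqueNames(Map<String, String> idToName) {
        Set<String> taken = new HashSet<String>();
        Map<String, String> result = new LinkedHashMap<String, String>();
        for (Map.Entry<String, String> e : idToName.entrySet()) {
            String name = e.getValue();
            if (!taken.add(name)) {
                // collision: the user now sees the ID, but at least
                // the URL identifies the resource unambiguously
                name = e.getValue() + " (" + e.getKey() + ")";
                taken.add(name);
            }
            result.put(e.getKey(), name);
        }
        return result;
    }
}
```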

Security

Just a few words about security. Because WebDAV is just an extension to HTTP, and because we handle it by means of a servlet implementation, you can easily set up security the same way you would for a regular web app. I did it this way:

  • Secured the /* resource for all WebDAV methods for some dummy role in the security-constraint element in web.xml

  • Requested basic authentication method in login-config

  • Developed a custom JAAS LoginModule that does nothing but remember the user credentials somewhere and let the user in

  • Plugged in the LoginModule according to Tomcat guidelines

  • Ran the actual authentication against the repository API when the user accesses a protected resource. (You could also do it in the previous step, without the need to "remember" credentials.)
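The first two steps correspond to a web.xml fragment roughly like this — role name, realm name, and the method list are placeholders; list whatever WebDAV methods your servlet actually supports:

```xml
<security-constraint>
  <web-resource-collection>
    <web-resource-name>webdav</web-resource-name>
    <url-pattern>/*</url-pattern>
    <http-method>GET</http-method>
    <http-method>PROPFIND</http-method>
    <http-method>MKCOL</http-method>
    <http-method>MOVE</http-method>
    <http-method>DELETE</http-method>
    <!-- ...plus the rest of the WebDAV methods you support -->
  </web-resource-collection>
  <auth-constraint>
    <role-name>webdav-user</role-name>
  </auth-constraint>
</security-constraint>

<login-config>
  <auth-method>BASIC</auth-method>
  <realm-name>EasyDoc</realm-name>
</login-config>

<security-role>
  <role-name>webdav-user</role-name>
</security-role>
```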



The Story Ends

This tool ate up all my evenings for about two weeks, plus one weekend. My family blessed it only because I really wanted it. But… the reward was as unexpected as it was great. The CTO of the company decided to host a contest for the best plug-in for the project management system, one that would do something useful with the new API (I bet they wanted to test it :). I submitted my tool. The first prize was a black Apple MacBook. Guess what... I got it.