Sunday, July 22, 2007

next steps for exyus

now that things are pretty stable (codebase-wise), i'm putting together a list of 'next steps' for the framework. here's what i have so far:

  • add support for feeds
    this will include publishing a feed from the blog (or other data source) as well as the ability to consume feeds from other sources. consuming feeds will also mean support for caching the feed locally and transforming it as a standard data source. 
  • add support for external editors
    most likely this will be in the form of the metaWeblog API. i figure the ATOM format is really more 'geek-friendly' but the metaWeblog API has been around for several years and there's lots of support for it 'in the wild.' i'm thinking of using MSFT's Live Writer as a test-bed for my implementation, too.
  • add support for handheld posting
    this would allow me to post from my moto-q phone or any other html-aware small form-factor device. i think this means a pure html approach (no javascript). shouldn't be too much of a problem. the authentication might take some thinking (a simple user/password screen that does the real basic auth on the server?), tho. also, it will be a post-only pattern (no editing existing posts, deleting, etc.) that does not support markup (no p-tags, et al).
  • add support for DIGEST auth
    this has been on my list for a while. it'll be a bit gnarly (my BASIC auth pattern is kinda buried in the code now) so i've put it off for a while. but i have a working example of DIGEST support for C#/ASP.NET already. so it'll be a relatively boring (i hope) day or two of hacking up the code and getting all the bugs out.
  • move users and auth to db
    currently, the user list (small) and the permissions list (kinda growing) are stored in two XML files. this works well for now and is easy to work with (and cached, too). eventually, this should be moved to the SQL-DB, tho. not a big rush on this one. it's another example of something that will happen behind the scenes.
  • add SSL support
    first, i need to post the code to a public server. second, i need to get a simple SSL cert installed. finally, i need to make sure my existing code works well with SSL. should be no big deal. the only 'magic' might be getting the exyus framework to force SSL for certain URLs. not sure if that's needed, but i can see benefits. 

well, that should keep me busy for a while!

sweetening exyus

i did more work this week to sweeten the exyus framework.

304 fixed

first, i fixed my broken 304 handling. when i was storing all cached data on disk, the 304 handler would compare the file timestamps against the if-modified-since and etag headers (serialized) to determine if 304 was the proper response. when i moved the cache into memory, my 304 checking was broken. i would return data from the cache, but always return false for 304 checking (no file data anymore!). to fix this, i resurrected a simple cache object that has timestamp, etag, and payload properties. now all resource entries in the memory cache are cache objects and my 304 checking is working again. sweet!
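
for illustration, here's a minimal sketch of the kind of cache object i'm describing - the names and shapes are made up for the example, not the actual exyus code:

public class CacheEntry
{
    public DateTime LastModified;   // serialized into the last-modified header
    public string ETag;             // serialized into the etag header
    public string Payload;          // the rendered resource body

    // true when the client's validators still match, i.e. 304 is the proper response
    public bool IsNotModified(string ifNoneMatch, DateTime ifModifiedSince)
    {
        if (ifNoneMatch != null && ifNoneMatch == this.ETag)
            return true;
        return (this.LastModified <= ifModifiedSince);
    }
}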

support for HTTP HEAD

second, i added support for the HTTP HEAD method. actually, it was embarrassingly easy. i just call the GET method and then toss out the response content. this allows for all the usual good stuff (304 checking, marking the item w/ last-modified, and etag, etc.), but doesn't return a body. now i have solid HEAD support!
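
in sketch form, the idea is something like this (hypothetical names - the real handler plumbing differs):

public void Head(HttpContext context)
{
    // run the normal GET pipeline so all the usual header work happens
    Get(context);   // assumed GET handler method on the same class

    // then suppress the entity body - status and headers still go out
    context.Response.SuppressContent = true;
}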

cache invalidation implemented

my other big task was to implement a simple cache invalidation pattern. i did this by implementing support for the cache-control:no-cache header in my GET requests. that means anyone can issue a GET request to a URI with the cache-control:no-cache header and that will force the system to invalidate any existing cached version of that resource and return a fresh one. of course, along the way, this fresh version will get placed in the cache for future calls! *and* (to make it all really nice), you can even use HEAD instead of GET!
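
a minimal sketch of the check, assuming a simple memory-cache wrapper (the cache object and names are illustrative):

// honor cache-control:no-cache on GET (and HEAD) by evicting the cached
// copy so the resource gets rebuilt and re-cached on this request
string cc = context.Request.Headers["Cache-Control"];
if (cc != null && cc.IndexOf("no-cache") != -1)
    memoryCache.Remove(context.Request.RawUrl);   // assumed cache wrapper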

simplify data object declaration

finally, i tightened up the initial declaration for data objects. no longer do you need to declare each verb's XSD/XSLT set. you just supply a single directory location that holds the [verb].xsd and/or [verb].xsl files. this emphasizes the standard interface concept for HTTP/REST and simplifies/shortens the declaration code. now, a data object declaration looks like this:


[UrlPattern("/data/websites/(.*).xcs")]
public class websiteData : SqlXmlHandler
{
public websiteData()
{
this.ConnectionString = "mamund_personal_db";
this.UrlPattern = @"/websites/(.*).xcs\??(.*)$";
this.DocumentsFolder = "~/documents/websites/";
this.XHtmlNodes = new string[] { "//description" };
this.ClearCacheUri = new string[]
{ "/blogging/",
"/blogging/websites/",
"/blogging/websites/{@id}"
};
}

Saturday, July 14, 2007

getting to the next level

it's been a while since i posted here, but that's just 'cuz i been bizzy!

finally getting the exyus framework distilled to the important bits. i now have a single pageHandler class that handles standard HTML GET activity. it can be sub-classed and declaring a new page in the site looks like this (C#):

[UrlPattern(@"/blogging/.xcs\??(.*)$")]
public class home : pageHandler
{
public home()
{
this.TemplateXml = "/blogging/index.xml";
this.TemplateXsl = "/blogging/index.xsl";
this.MaxAge = 600;
}
}

note i only need an xml document (that supports x:include, btw) and a transform (that supports exslt, etc.) and i'm rockin'.

i also added a dataHandler class today. this can be used to produce XML resources that respond to the standard GET/POST/PUT/DELETE interface. again, i distilled it down to a small bit of code that leverages XML, XSD, XSLT and a dash of T-SQL. here's a typical class:


[UrlPattern("/data/weblogs/(.*).xcs")]
public class weblogData : dataHandler
{
public weblogData()
{
this.ConnectionString = "mamund_personal_db";
this.UrlPattern = @"/data/weblogs/(.*).xcs\??(.*)$";
this.XHtmlNodes = "//body";

this.PostXsdFile = "~/documents/weblog/weblog-post.xsd";
this.PostXslFile = "~/documents/weblog/weblog-post.xsl";
this.PutXsdFile = "~/documents/weblog/weblog-put.xsd";
this.PutXslFile = "~/documents/weblog/weblog-put.xsl";
this.DeleteXslFile = "~/documents/weblog/weblog-delete.xsl";
this.GetXslFile = "~/documents/weblog/weblog-get.xsl";
}
}

again, it's all about creating the proper schema and transform files. of course, there's some T-SQL in the background, but that's another story.

the point here is that it's now pretty simple to create a web site that supports all the standard (X)HTML stuff as well as acts as a good HTTP netizen. most of this is following the REST model (with liberties along the way).

finally, i implemented a no-cache pattern last week. now, if you do a GET using the cache-control:no-cache header, exyus will ignore any cached value and rebuild the request (caching it along the way, if called for). using the header is a bit cumbersome for html clients, but works fine for scripting and ajax work.

i need to exercise the new upper-level classes (dataHandler mostly) but they should be solid. next, i want to work more on the requestor class (to make internal http requests) and then look at wrapping things up for the first post to a public server!

Friday, June 29, 2007

finally solved a terrible bug

two weeks ago i encountered a *terrible* bug that made my entire rendering system useless. basically (without going into all the upsetting details), any item delivered from my cache was causing errors in MSIE ("Switch from current encoding to specified encoding not supported.") and forced FF/Safari to render the plain XML instead of the resulting web document. *crap!*

i struggled with various workarounds trying to isolate the problem, but could *not* find the answer. i am using a mix of XML from the database, XML from a template document and XSL transforms. add to that a sprinkling of caching the generated document to disk for quick replay and i was having a devil of a time sorting it all out. in fact, after several days of no luck, i just put the thing aside for about a week.

well, i started in again last night and - although i got almost nowhere last night - the solution finally hit me this afternoon. it *was* an encoding issue, but not where i thought it was. not in the database data, not in the template xml, not in the cache file itself, but in the xsl transform! i noticed that a few of my templates worked just fine. but one set (my blog templates) did not. the answer? my transforms for the blog site were **missing the xml version/encoding declaration (the <?xml version="1.0" encoding="..."?> line) at the top of the file**!

yep - bonehead play. i was too quick to build my blog templates (in a single night about three weeks ago) and did not fully test them at the time. they worked while i had caching turned off (i do this while developing). it was only after i turned caching back on that the problem came up. apparently, the encoding declaration in the transform is critical to the way the final document is output and (in my case) stored to disk.

once i sorted this out, things are working quite well (sigh).  of course, now i will need to do some serious testing to make sure i reach into all the corners, but it's much better now.

man - what a bonehead!

Tuesday, June 19, 2007

refining the new dispatch model

i made some additional progress on the new dispatch model for the framework. i now have the templateHandler in the original Exyus assembly. i also have a separate Exyus.Blog assembly to handle the blogging details. i had to tweak some regex strings in order to fix a bug with the arg-handler, too.

now, i just need to press the edges some more over the next few days to make sure it's all cool. all looks good. but i could proly get a lot more out of this setup if i had better regex chops.

next step - implement the other template pages for the blogging app.

Monday, June 18, 2007

two big breakthroughs this weekend

i made two critical breakthroughs this weekend on the framework:

- support for passing arguments in xml documents

- simplified dispatch pattern for the handlers



argument replacements in xml templates

first, i can now support a simple argument pattern within xml documents that are treated as templates. for example:



<x:include href="/xcs/data/weblogs/?category=$category$" />



will match the ?category=1 from the request url and replace any $category$ with the value "1".



i also added a $_qs$ argument which represents the *entire* query string. so far, this has worked well for passing filters to the data-oriented requests. i'll continue to beat this up over the next several days to make sure it's all working as expected and without nasty side effects.
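
a rough sketch of the substitution idea (a hypothetical helper, not the actual template code):

// replace $name$ markers with matching query-string values and
// support the special $_qs$ marker for the entire query string
static string ResolveArgs(string template, NameValueCollection args)
{
    string qs = "";
    foreach (string key in args.Keys)
    {
        template = template.Replace("$" + key + "$", args[key]);
        qs += (qs.Length == 0 ? "" : "&") + key + "=" + args[key];
    }
    return template.Replace("$_qs$", qs);
}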



simple dispatch pattern

thanks to some serious inspiration from brad wilson, i now have a much more scalable dispatch pattern using a custom attribute and reflection. now, instead of having to craft a special handler factory to respond to all xcs calls, i have a single factory that can scan all loaded assemblies for eligible handlers/factories that can respond to requests.



it works by decorating my handler (or factory) class with a new attribute:



[UrlPattern("/xcs/data/weblogs/(.*).xcs")]



and there's a single factory class that scans all loaded assemblies and catalogs all classes with this attribute. then, upon each request, the collection is checked and, if a match is found, the associated class is loaded and executed for that request.
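
in sketch form, the pattern looks something like this (illustrative shapes only - the exyus internals differ in the details):

// the attribute that decorates each handler class
[AttributeUsage(AttributeTargets.Class)]
public class UrlPatternAttribute : Attribute
{
    public readonly string Pattern;
    public UrlPatternAttribute(string pattern) { Pattern = pattern; }
}

// build the catalog once by scanning every loaded assembly
static Dictionary<string, Type> BuildCatalog()
{
    Dictionary<string, Type> catalog = new Dictionary<string, Type>();
    foreach (Assembly asm in AppDomain.CurrentDomain.GetAssemblies())
        foreach (Type t in asm.GetTypes())
            foreach (UrlPatternAttribute a in
                     t.GetCustomAttributes(typeof(UrlPatternAttribute), false))
                catalog[a.Pattern] = t;
    return catalog;
}

// per request: first regex match wins - load and execute that class
static object Dispatch(Dictionary<string, Type> catalog, string url)
{
    foreach (KeyValuePair<string, Type> entry in catalog)
        if (Regex.IsMatch(url, entry.Key))
            return Activator.CreateInstance(entry.Value);
    return null;   // no handler claimed this url
}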



again, i need to beat it up a bit to make sure it covers all the bases, but - so far - this is working quite well.  hopefully, it will scale well as the number of handlers increases for a single web app - time will tell.



so, improved dispatching and solid arg-passing for xml templates (btw - i have solid xml templates!). that covers another set of important features.



i'm now very close to completing the templates for the blog. once that is done, i need to put together the editor tools.

Friday, June 15, 2007

improved the date filtering for the blog

i added a number of new 'reserved' words to date queries for the blogging app.



today

thisweek

thismonth

thisyear

yesterday

lastweek

lastmonth

lastyear

ayearago



now you can query the data like this:



/data/weblogs/?thisweek



and see all the posts on file for this week



not bad



i also finally sorted out support for secured templates. i've created a new space /templates/ that requires http auth - right now only the site admin has it. and the runtime uses siteadmin rights to load anything via includes. that means any direct calls by clients will prompt for user auth. any *internal* calls via includes, etc. will always be decorated with the siteadmin user/pass and will work without trouble.



pretty slick!



i plan on adding a robots file to prevent any attempts to crawl that space, too.



i also did some work on a simple template this past week. it's pretty clean. i think it will work for starters. now i just need to sort out the final details for the display side and implement it locally. proly take a day.



then i need to add the editing UI. i think i need to step up and use the dhtml components i've been working on at the office (html edit, calendar control, sort/page table, tabs, etc.) this should make the ui more interesting with less hackery, too.



so i'll proly spend a few days getting those all working smoothly, too.



it's inching along.



Tuesday, June 12, 2007

the value of 3xx

in my reading of the rest-discuss list and digging around in general, i am starting to realize that the 3xx response codes can be very valuable in handling a number of common situations properly.



301 - moved permanently

does iis handle this at all?



302 - moved temporarily

shuttle the request to another uri for this moment - iis handles this one (not always well)



303 - see other

commonly used to redirect POST to another page - not using this at all, might be quite handy



304 - not modified

the requested resource has not changed since you last asked - i use this a lot now



gotta get my hands into the 303!
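
for the record, a 303 after a successful POST might look like this in asp.net (a sketch - 'newId' is a made-up variable):

// answer the POST with 303 see-other so the client follows up
// with a GET on the newly created resource
context.Response.StatusCode = 303;
context.Response.AddHeader("Location", "/data/weblogs/" + newId + ".xcs");
context.Response.End();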





Sunday, June 10, 2007

stalled, but learning

i'm kinda stalled on upgrading my rest-ian framework. the initial templating system is working, but i need to iron out and sand off the rough edges on my security implementation for embedded templates (templates calling templates...). it should be working fine in a few days.



in the meantime, i helped jesse implement another feature on his web site. although he is not using the 'full-blown' rest-ian model, most of the recent changes i've implemented for him use rest-like patterns and lots of xml/xslt work. he's actually getting the hang of working with xsl transformations, too.



finally, i stumbled upon a couple related items on a rest email list this week and it's got me thinking:





Megginson Technologies: Quoderat

Lost Update Problem. This doesn’t have to do with REST per se as much as with the choice not to use resource locking, but since REST people tend to like their protocols lightweight, the odds are that you won’t see exclusive locks on RESTful resources all that often (it also applies to some kinds of POST updates as well as PUT).


Reliable delivery in HTTP

Achieving reliable delivery in HTTP is not difficult. It takes a little bit of understanding of how HTTP works and what reliable delivery means.


both of these items point out common problems with any remoting/async system. and they both have very simple solutions!  i really like the version id solution for the lost update. i am still working on the details of reliable delivery using POST (or RPOST).
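
the version-id fix boils down to a conditional PUT. a sketch (LoadCurrentETag is an assumed helper, not real code):

// the client must echo the etag it last saw; a mismatch means the
// resource changed underneath it - reject instead of silently overwriting
string ifMatch = context.Request.Headers["If-Match"];
string currentETag = LoadCurrentETag(resourceUri);   // assumed helper
if (ifMatch == null || ifMatch != currentETag)
{
    context.Response.StatusCode = 412;   // precondition failed
    context.Response.End();
    return;
}
// safe to apply the PUT and hand back a fresh etag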



but they both remind me that there are a number of additional things to consider before i claim a 'complete' framework!



Wednesday, June 6, 2007

more on hypermedia and the engine of app state...

been running into several other posts that relate to this issue (see posts below)



one thing that is starting to sink in is that the hypermedia aspect relates closely to media types. in a nutshell, if replies from the server express media types and links, i should be able to figure out what kind of representation is allowed/expected at each link (end-point) along the way. if a system (client) understands a collection of media-types, it stands to reason that the client can make a GET to the server to retrieve a list of links and media types and then be able to compose a PUT/POST based on the media-type and pass it to one or more of the links. i think i'm getting it...



An Opinion? Well, if you ask...: D'oh! REST already had contracts.

The contracts and protocol exist solely in the data. A client that understand a media type can then navigate the web of links based on knowledge gleaned from documents conforming to that media type. (This is exactly hypermedia as engine of application state).


Joe Gregorio | BitWorking | Do we need WADL?

Everybody's atwitter about WADL, a description file for REST services, and since it's supposed to be RESTful I regularly get questioned about it.


Shopping - SimpleWebServices

In the REST world, shopping revolves around application state - transitioning as required.

Tuesday, June 5, 2007

hypermedia as the engine of application state

i don't yet 'grok' the last one in the list:



REST is defined by four interface constraints:

- identification of resources

- manipulation of resources through representations

- self-descriptive messages

- hypermedia as the engine of application state





i'm learning, but i'm not quite there yet.





Saturday, June 2, 2007

big leap forward today

i made a big leap forward in the rest-server framework today. i added support for using xml server-side templates that support x-include and other good stuff. that means it's possible to define an xml template that includes several calls to the /data/ portion of the system and then apply a transform to it on the server. it generates clean xhtml and sends it to the client (optionally caching it on the server, too).



all very sweet.



i ran into a few (not unexpected) hurdles along the way. first, since the /data/ part of the system is secured, templates calling into the /data/ area always failed rudely unless the user had already authenticated. i decided to implement a server-side system-user account that can access the data quietly as needed. this is the road of least-friction. i also plan on implementing a version that will force the user to log in first. that will take a bit of doing. i can easily write an x-include fallback, but i also need a clean way to pass credentials (i think) once the user has logged in. it's on my list of things to check out.



another stinger was the whole xml:base resolution deal. i decided to keep my xml templates in a folder not directly accessible to external users (prevents hacking). but that meant for each request, i needed to 'adjust' the internal path to point to the template pile. that - in turn - broke the xml:base for the x-includes. as a result, i now diddle the url as it comes in to make sure it matches the one for the initial template. now it all seems to work fine. this may not be the correct solution, but it works.



i also added support for xhtml validation of user input on the server. i use the built-in xml validator, with a DTD/Schema attached. works great. now i can make sure saved data does not end up hosing the xml/xsl templates. i did run into a bit of trouble, tho. seems my escapeHTML javascript hack converts the input to HTML (not xhtml) each time i *read* it in. this is a bummer. i need to check into that since it means each time i edit an existing record, all my cool xhtml changes are removed.
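
the validation step, in sketch form (the schema path and the error handling here are assumptions, not the actual exyus wiring):

// validate the posted xhtml fragment with the built-in .net validating
// reader; any violation gets surfaced as a 400 back to the client
XmlReaderSettings settings = new XmlReaderSettings();
settings.ValidationType = ValidationType.Schema;
settings.Schemas.Add("http://www.w3.org/1999/xhtml",
    HttpContext.Current.Server.MapPath("~/schemas/xhtml1-strict.xsd"));   // assumed path
settings.ValidationEventHandler += delegate(object o, ValidationEventArgs e)
{
    throw new HttpException(400, "invalid xhtml: " + e.Message);
};
using (XmlReader reader = XmlReader.Create(new StringReader(userInput), settings))
{
    while (reader.Read()) { }   // a full pass forces validation
}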



finally, i did a bunch of code clean-up and some object renaming. i've been meaning to do that for quite a while, but just never took the time. now it's much cleaner (references to amundsen and REST are now gone) and i've moved a few things from local vars into const and such.



so, i have a solid templating solution started. i need to beat it up a bit before i will rest, tho [grin]. i figure building the public side of the weblog app will give me enough chance to expose weaknesses in the model.



onward and upward!

Tuesday, May 29, 2007

improved list queries for the weblog app

i added a handful of improved list queries for the weblog app today. these include the following

/data/weblogs/?recent - returns the most recent five entries

/data/weblogs/?category={@id} - returns all the entries for the requested category id

/data/weblogs/?year={@yyyy}&month={@mm} - returns all the entries for the requested year & month



i also added some list filters for the website list:

/data/websites/?recent - returns the ten most recent website entries

/data/websites/?category={@id} - returns all the entries for the requested category id



this will make building a typical blog home page easier and more consistent.



i updated the handler code and had to twiddle the isapi rewriter regexp, too. the rewriter update was more tedious than the code update! anyway, after making changes to the code and rewriter, the /data/* calls all work as expected. my next task will be to define the 'public' URI patterns and implement handlers for them, too.



i also need to confirm that the framework code is using the MVP XML library - specifically support for x:include. i like the x:include pattern since it makes it very easy to build server-side pages that contain several 'xml parts.' this also means that i need to make sure other things like state and security will work with x:include. based on my past experience, this should go fine, but i need to dig into it to be sure.



anyway, things continue to move along well. assuming all goes fine, it's quite possible that i'll have the blog ready for data entry by the end of this coming weekend.



Monday, May 28, 2007

added support for the website objects

i added support for the website objects today. this will be my 'blog roll' for the blog. that rounds out my initial set of objects for the blog:

- weblog

- category

- website



i tested them all against the data api and all work as expected. i think i need to update the _list function a bit. maybe return a default top X number unless the 'all' argument is passed, i think. i'll add some other filtered list requests later - by year/month, by category should be good starters.



i think i need to add the top X return to the website_list, too. i'll proly work that out this week.



then it's on to the 'live' version of the site. for starters i'll define the following:

- home page

  + last 5 full posts

  + list of categories

  + list of websites

  + link to get full list of posts (reverse chrono order)



later i'll add some other stuff like:

- contact page

- posts by category

- list of all websites (full blogroll)



i need to implement a url pattern for the posts that work on a 'hackable' model:

- /posts/yyyy/mm/post-id - returns a single post

- /posts/yyyy/mm/ - returns list of posts for that month

- /posts/yyyy/ - returns a list of posts for that year

- /posts/ - returns all posts



this probably will be handled via the rewriter

/blogging/yyyy/mm/

/blogging/?year=yyyy&month=mm



then i just need to implement the filters for the _list query.



i also need to get a good editor working for the blog posts.

- edit weblog entries (uses a category list)

- edit category entries

- edit website entries (uses a category list)



well - that's all for now.

Saturday, May 26, 2007

started my weblog implementation

finally got the courage to start to implement my weblog in the new REST framework <ta-da!>.



i was able to implement the database side (weblog, category, websites) in a little more than an hour. of course, i'd done some planning ahead of time so the implementation was just 1-2-3 simple. that included the table definitions as well as the standard sprocs that produce xml output from sql server (_list, _read, _add, _update, _delete).  later, i suspect i need to add some customized _list sprocs (most recent, search by date, etc.).



i was able to build each http handler in about 30 minutes. this involved copying an existing handler and modifying a few references to XSD and XSL files. then actually creating the XSD to validate the input and XSL to convert the XML input into acceptable T-SQL to submit to the database server. i got the weblog and category classes completed. i still need to implement the website class, tho.



also, i set up an endpoint for the weblog and the category classes. that meant updating my handlerfactory class and modifying the auth-url and auth-users xml files to control access to the new endpoints.  that took just a few minutes.



finally, i copied my stock html-ajax page that allows me to target the endpoint and do add/update/delete actions. this is my basic test bed. that took me less than 30 minutes to handle both the weblog and the category. i was able to create sample data for both classes - no problem.



i also took the time to re-org the endpoints. now, all the actual data access is done through the /data/* URI. originally, i had the data behind each app (/running/runlogs, /eating/foodlogs, etc.), but as the data library grows, i think it will be easier to control (esp. access control) if it all appears under a single /data/ endpoint (/data/foodlogs/, /data/runlogs/, etc.). i'm not sure how deep to make the hierarchy - there's no limits, i guess. it will take time and some experimenting, but i'll get it.



i need to implement the website endpoint (handler and test page), but that should take less than 30 minutes. once that is done, i'll be ready to implement a 'live' display and edit service for the weblog.  i'm thinking that i will finally start to integrate the DOM/DHTML UI components that i scouted over the last few months (tabs, tables, tree, etc.).



anyway, it's been a good week. doing this next app has a bit more complexity and it's giving me a chance to expand the features and value of the rest framework. all fun stuff!



Monday, May 21, 2007

updated my food log app

i updated the foodlog app to look/feel similar to the runlog app i updated last week. it now has a slicker public side and a bit smarter editing experience. they both support html content for the notes section and i'm starting to use things like anchors and image tags to spruce up the content - looking nice.



i want to run it a bit longer here on my workstation iis edition before packing it up for the public server. but i suspect the public server will happen before the end of the month. 



in the meantime, i am starting to plan out a simple blog app. the first pass will have entries and categories. i'm considering a blog roll and comments, but i might use an available service for those (get to test my ajax/rss chops!).  anyway, the app will be a bit more complex than the 'log' apps i'm using today. just the right step up in my process for building rest-ian apps.



on another front, i've stopped working on the http auth side of things. while basic-auth is working, i have no ssl cert to guard my transactions. i also haven't implemented digest auth. that is probably what i should focus on next. i also want to tighten up the session and cookie services baked into the framework. finally, the user accounts and url auth data is currently stored in xml files. that will need to be moved to a db relatively soon, too.



it's looking good and solid - i like it!



Sunday, May 20, 2007

moved exyus runtime to root folder

made another big step today in rolling out a working version of exyus - moving it to the root folder of a server.



actually, a sub folder. but the point is that the app no longer runs under the asp.net debug server. i implemented some isapi rewrite rules, modified the xsl and scripts and now all runs fine as a standard iis-hosted app. it's all about the configs now.



i tweaked my running log app to use a bit of javascript (running.js) and a clean css file (running.css). i need to do the same with the food log app. once that's done, both apps will be fully functional and stand-alone.



next, i need to package up the two apps and post them to my public-facing server that uses sql server (not the express flavor i have here).  at that point, the release will be complete - a fully functional rest-ian lightweight web framework.



once the thing is hosted publicly, i just need to start designing and implementing other rest-ian web apps. my next target - a blog app. then i can test my rss/atom output skills, too!



Tuesday, May 15, 2007

implemented a new 'app' in about two hours

had a good time this weekend. i implemented a second simple http/rest web app in about two hours. this time, instead of my jogging log, i created a simple weight loss log. while it was really simple (just one table with a handful of inputs) it was important to me to create a new app from scratch to test out the rest-server pattern i've been working on for the last month or so.



and it went well. i did do a lot of copy and paste (one reason it went so fast). that means i might be able to factor some of the copy-paste into the base class. but i want to make sure not to 'over-engineer' things. i implemented the xml data uri, then a simple xhtml list/detail pattern for public viewing. setting up the security was a bit tricky at first (forgot some of my details for setting up secure urls and granting user permissions), but it fell into place once i dusted off the ideas. currently the security is all done via XML files, but i can see how doing this in a db would make things smoother.



finally, i also updated the running app by cleaning up the uri pattern and refreshing the public pages. it took a tiny bit of work to update the XSLT and a bit of trundling through the javascript client code. but it's all good.



i'm having some fun here. my next target? posting all this stuff to a public server instead of just on my workstation.



Thursday, May 3, 2007

side track into client ui

i've been spending the last several days (at work) putting together some web 2.0 ui pages. this is focused on a POX (plain old xml) back end and an ajax-inated javascript browser front end. been kinda fun, actually.



in the process, i've decided to build a small (but useful) DOM-friendly collection of ui components. i'm focusing on the following:

- tree control

- table (sort-able, filter-able, page-able)

- tabs

- menu (fly out and drop-down)

- modal dialog

- collapser (toggle page sections open/closed)

- auto-suggest (ala google, et al)



i also want to settle on a calendar and dhtml editor, but have not found anything i like yet.



also, i've worked up a basic 'behind-the-scenes' set of javascript libraries to make my life easier:

- dom helper (base from dean edwards)

- mozxpath (nice cross-browser xpath against xml docs)

- ajax library (built on the one from ajax patterns wiki)

- data library (found a decent one i need to work on)



so that's a nice set to start with, eh?



so far, i've done enough work with the ajax and mozxpath libraries to know they work great. i like dean edwards' base2, but it's a bit early to tell if this will last.



i also posted a clean modal dialog and auto-suggest set in my googlecode space.  i plan on posting the other parts as they get exercised (or exorcised, as the case may be<g>).



fwiw, i still have work to do on the server-side rest library. but that might wait a week or so as i get through this client stuff.



Monday, April 30, 2007

wrapping up http caching

after poring over the http 1.1 docs and a number of other articles and books, i'm pretty sure i have the basic caching pattern down:



PUBLIC CACHING

to reduce traffic, use expiration caching (expires for 1.0 and cache-control for 1.1) to tell public caches to serve requests without talking to the origin server.



to reduce bandwidth, use validation caching (last-modified for 1.0 and etags for 1.1) to tell public caches to use conditional gets to the origin server who can return a 304 instead of returning a full copy of the resource.



so this is all cool. i have this built into the ws-loop now. if you are doing a GET and the class generating the resource has max-age, etags, and/or last-modified set, caching headers will be sent with the response. and whenever a class is handling a GET, it will check for if-modified-since and if-none-match and return 304 if the resource has not changed.
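
on the response side, emitting both flavors looks roughly like this in asp.net (a sketch - maxAge, lastModified, and etag are assumed to come from the handler class):

// expiration caching (1.0 expires + 1.1 cache-control) and
// validation caching (1.0 last-modified + 1.1 etag) in one place
HttpCachePolicy cache = context.Response.Cache;
cache.SetCacheability(HttpCacheability.Public);
cache.SetExpires(DateTime.Now.AddSeconds(maxAge));   // http 1.0 expiration
cache.SetMaxAge(TimeSpan.FromSeconds(maxAge));       // http 1.1 expiration
cache.SetLastModified(lastModified);                 // http 1.0 validation
cache.SetETag(etag);                                 // http 1.1 validation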



LOCAL CACHING

so that's the public side of things. under the hood, the ws-loop is creating a local copy (on disk) of the requested resource and using that to serve up to clients as long as the file-date is not stale (based on max-age).  so, while the class still has to respond (no direct file delivery here), it will still deliver a disk copy of the resource for as long as it is directed. that's cool.



GZIP/DEFLATE

now the only other item that could be interesting would be gzip and deflate. i need to look into how to implement that for dynamic resources (when supported) using the .net 2.0 compression classes.
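
a sketch of what that might look like with the .net 2.0 compression classes (System.IO.Compression), using the usual response-filter trick:

// wrap the response stream in a gzip filter when the client asks for it
string acceptEncoding = context.Request.Headers["Accept-Encoding"];
if (acceptEncoding != null && acceptEncoding.IndexOf("gzip") != -1)
{
    context.Response.Filter =
        new GZipStream(context.Response.Filter, CompressionMode.Compress);
    context.Response.AppendHeader("Content-Encoding", "gzip");
    context.Response.AppendHeader("Vary", "Accept-Encoding");   // play nice with caches
}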



CLEAN-UP

once that's done, i have some housekeeping to do with the ws-loop. cleaning up naming, some constants, implementing the ws-loop at the server root using isapi rewriter, etc.



once the clean up is done, some other niceties like moving the user/auth work into the db instead of xml and maybe making caching an external config instead of code work would be good. then it's back to building REST-ian apps in general.



Saturday, April 28, 2007

implemented basic caching for the rest server tonight

after adding a simple write-to-file routine for the GETs in my REST server, i also implemented support for ETag, Last-Modified, and Cache-Control max-age headers.



now, when a resource is generated, if the max-age is over zero, the results are written to disk and caching headers (etag, last-modified, and max-age) are written to the client.



in addition, when a GET is received at the server, the routine can check for If-None-Match (for etags) and If-Modified-Since (for last-modified) request headers and, if possible, simply return 304 instead of regenerating the request.



none of this is marked public yet. i still want to work on some details of the cache-control header including things like must revalidate, freshness, and no-cache directives for cookie-d requests, etc.



still, the server is getting more robust and generating cache-able content. and that's good!



beat a perf challenge today

last night i added support for xhtml strict doctypes in my html outputs from the rest server project. i immediately saw a serious perf cost to doing this - wow!  it bugged me all night long and it was only this am when i finally had the time to work through the problem.



turns out the cost was not in transforming the xml into xhtml via xsl - it was in reconstituting the output of my transformation from a string pile back into a new xml document (long story on all that...).



anyway, since i need to output the final product (xml document, string, or strongly-typed object) as a serialized string pile to the browser, i just added some smarts to the base class that handles the http response to allow for more than just an xml document or C# type. now it handles a string pile, too.



*big* improvement! now i know i'm able to generate strict xhtml and i still have solid performance.



next for me is to implement a local caching pattern that allows for support of public caching servers. it's all about the GET this weekend!



Friday, April 27, 2007

tim ewald has a lightbulb experience

I finally get REST. Wow.

The essence of REST is to make the states of the protocol explicit and addressable by URIs. The current state of the protocol state machine is represented by the URI you just operated on and the state representation you retrieved. You change state by operating on the URI of the state you're moving to, making that your new state. A state's representation includes the links (arcs in the graph) to the other states that you can move to from the current state.

Wednesday, April 25, 2007

share nothing architecture

as i work toward a rest-ian model for my web apps, i am reminded of a simple rule of thumb when implementing web content: share nothing.



that means don't assume cookies or session that 'binds' a request or series of requests to a web server. as long as each request is completely stand-alone, it can be delivered from any server that has a copy of the requested resource.



again - thinking in terms of resources (not objects or pages) is the key. when a user makes a request, the origin server will need to resolve the request into a (stand-alone) resource. once this is done, that resource can be stored anywhere (including a third party caching server). then it can be replayed from the stored location.



the only tricky part here is aging the cached item. since i want to support third party caches, i can't rely on local 'dirty bits' to clear the cache when data changes. besides, that's a kind of 'not-shared-nothing' approach! instead i need to set a max-age value for caching and/or use an etag model. that way, when requests are made, the cache (local or remote) can properly sort out the details.



when we talk about third parties, things like forced validation and other issues will come into play, but i don't need that right now. what i need to focus on is a clean and simple private caching pattern using max-age and etags. then i can move out from there to public caches.



again, the key is to make sure i use the 'shared-nothing' approach when composing a resource. then it's easier to replay.



auth - now that's a diff story...





Monday, April 23, 2007

crockford made a good point about compressing js files

his major beef with the common js compress utils that remove whitespace *and* obfuscate the code by renaming vars, etc. is that the second step has the chance of introducing bugs.



i know that some of the compressed files i work with always *bork* firebug or ms script debugger, too.



he also pointed out that using gzip is a big way to reduce bandwidth - good point.



i guess i need to add a quality whitespace remover and some gzip 'judo' to my list of things to do.





Sunday, April 22, 2007

a new fix for the msie caching bug

ran into trouble with my ajax html client running in msie (7 in this case). it *refused* to refresh the pages it just edited! in other words, xmlhttprequest was not used to get the new data - msie served the data from a local cache instead!!!!!



i've seen this before and the most common fix is to add random data to the end of the URL to convince msie that this is a new url:



http.send("get",url+"?"+Math.random(),true);



it works, but is messy. i dug around for a while and turned up a posting on the msdn site that talked about this issue in some detail. the key point: msie honors a custom cache-control header extension that allows web servers to control this odd behavior:



Cache-Control: post-check=[sec],pre-check=[sec]



both values are in seconds.



the deal is this:

when ie completes an http get request, it places the results in a local cache. the next time it is about to make the same request, these two values come into play. if the copy in the local cache is younger than the pre-check value, msie will deliver the data from the local cache. *also*, if the copy in the local cache is *older* than the post-check value, msie will (in the background) get a copy of the new resource from the server and save it to disk. that way the *next* time the page is requested, it will be newer, but it will still come from the local cache.



convoluted, but efficient.



so i added the following cache-control header to all my output:



Cache-Control: post-check=1,pre-check=2



the good news is that this worked like a champ. no messing with the url means i can better support third-party caching servers and still get msie to behave properly.



there are a handful of details to setting this up properly, so it's worth checking out the page at msdn.





built my first client for the rest server

i spent a couple hours building a simple CRUD app that allows me to manage my jogging log entries.



it went without much fanfare. i created a single page (/running/edit/index.html) that handles all the actions via ajax (no direct posting). this means there is nothing bookmarkable, but that's fine for the editor.  once i got the pattern down, it feels a lot like windows programming (wow!).  the ajax library i am using is from the ajax patterns book/site - highly recommended.



the basic implementation involves two uri:

/running/edit/

/runlogs/



the running/edit uri holds the single html document that does all the work. powered by lots of js and some key libraries (base2 & mozxpath along with my version of the ajax pattern lib). the runlogs uri holds the actual data. this is served via my rest-server. plain xml from a db table in this case.



i also made sure to secure the /edit/ uri to force authentication. that worked fine, too.



i learned a handful of things along the way. this is a dirt-simple editor. it's also rock-solid. fast, too. a good user experience.



of course, i have no meaningful css here and the gross-level ui experience is very basic.  anyone with decent design and ui skills could make improvements.



but the point here is that i was able to quickly build the full editor and it all works nicely. yay!



Saturday, April 21, 2007

cookies, session state, and rest (again)

now that i'm wrapping up my initial round of implementing a REST-ful web server coding pattern in C#, i am staring directly at the whole "cookies are not REST" issue and working to resolve it as pragmatically as possible.



i even printed a copy of roy fielding's REST dissertation and am working through the parts that contain his comments against cookies and session state. on the surface, they make sense. basically cookies and session state can have the effect of controlling the way a URI is resolved (use the session state for filtering, the cookie for personalization of the page layout, etc). this means the resulting document can't be reliably cached or played back for other users (or possibly the same user). ok, i get that.



as a data bucket

so, my next round of thinking is *why* we got into the habit of using cookies and session state. first, they often are shortcuts - nothing more. i can stuff a value in a cookie and carry it around for a while (say, during a multi-page shopping experience) and then use it later. i can do this same thing using a server-side session bucket.  of course, the hitch on implementing server-side session state is that i need at least one cookie on the client to reliably link the client and the server session data.



as authentication

another common way to use cookies is to use them to handle authentication of a user. in other words, once a user presents credentials, a cookie is written to the client. then for every request, the server checks the cookie to make sure it's valid. you can even timeout the cookie to make sure users who leave their client unattended will eventually be 'de-authed.'



as an identifier

also, cookies are often used simply as an identifier. once a user logs in (say using basic-auth or digest-auth), a cookie is created and passed to the client. this identifies the client for all subsequent transactions. usually this is to help with tracking the user's actions on the site. sometimes the identifier is just a random value (commonly referred to as a session id) that is used purely for tracking purposes. it is then possible to play back a session using archived transactions kept on the server.



ok, data bucket, authentication, identifier...



auth not needed

i am working to use only basic-auth and digest-auth for authentication. there is enough support in asp.net, including httpmodules and the ability to access user and principal objects, to make that all work consistently. i'm confident i don't need cookies for authentication. i just need to accept that the web pages will occasionally popup the 'rude' browser auth dialog<sigh>.



data bucket, i think i can deal with this

i understand the point, but need to noodle on this for a bit. some trivial examples on the web involve creating a web resource at a 'disposable' URL that a client can use as a data bucket during a session. i can see this working via ajax, but am not clear on how to implement it in a more traditional server-side html page environment. again with the 'composer' issue. i don't want to compose pages on the server that contain non-replayable, personalization, or private data that might end up in the cache. i need to work on it, but i can see the possibility.



identifier - i still think i need this

first, i've started implementing a simple session cookie that i use to track transactions in the session and to prevent simple replay issues (i use some browser agent data as well as a random key). finally, i use a caching trick to timeout the session after x minutes of inactivity. by doing this, i can flip a flag in the code that will clear any personal data, start a new session, and force a new auth dialog (if needed). so i kinda really need that<g>.



second, while this random data helps me keep track of transactions by a client as well as timing out a client session, i still don't really know *who* this client is. for that, i think i need at least a 'friendly' name cookie or something. not sure why i really need this, but i have a hard time letting this go. the biggest thing that jumps out is that, when using basic-auth, any non-authed pages are missing the auth data entirely. i suspect the same is true for digest-auth, but i'm not positive. so i think i need a 'name' cookie<sigh>.



as long as the session cookie and (if included) the name cookie are not used to control the content of the resource they are safe to use. it's only when state data is used to change the resource representation in 'hidden ways' that things are bad (non-cachable).



i think i need to check into ways to use caching controls to better track how a resource is identified.



finally, i read a scary 'aside' while digging into this whole cookie-battle. it was regarding auth - something like 'tossing authenticate headers around screws up third-party caching.' hmmm... kinda makes sense. and is depressing.



ok, that's all for now. i'll soldier on.



can't forget ssl

while i continue to focus on getting the plumbing working for authentication and authorization in my REST server, i gotta remember that support for SSL should be on my list, too.



first, since i've only implemented basic-auth, ssl would be essential for any app that will live 'in the wild.' second, even w/ digest-auth added (sometime soon), ssl will be very desirable from a privacy standpoint.



that also brings up the issue of supporting https patterns. i suspect i'll do this using an isapi rewriter - not within the app loop itself. too much thrashing in the code, i think.



finally, i need to get ssl installed on my workstation. i think i can use msft's selfssl package for starters. i should also get a (cheap) ssl cert to install on the public machine.



ok - nuther thing to add to the list then.

major advance on the security side

i spent some time late last night and this am getting the next round of changes for the auth service. i now support a permissions model as well as a list of secured urls for the app.



HasPermissions(uri,httpMethod)

now, when the request is first received, the server will check the user's permissions collection and match the permission record with the requested uri. that will return the list of allowed httpmethods for this uri. sweet.  no more pinging the roles within the actual controller code. this is all done 'under the covers' before the controller even sees the request.
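
in sketch form (PermissionRecord and LoadPermissions are made-up shapes for illustration):

// the first permission record whose uri pattern matches decides the
// allowed methods; no matching record means no access at all
public bool HasPermissions(string user, string uri, string httpMethod)
{
    foreach (PermissionRecord rec in LoadPermissions(user))   // assumed helper
        if (Regex.IsMatch(uri, rec.UriPattern))
            return rec.Methods.Contains(httpMethod.ToUpper());
    return false;
}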



Secure Urls

i also added a way to control which urls require authentication. for example, you might be able to see the home page just fine, but need to log in before you can edit your blog. again, this is all controlled by a list of path regexp patterns. the system will find the first match and then return "true" or "false" on whether authentication is required to continue.



also, this is done completely under the covers - no controller code needed.



Guest

also, when auth is not required, the system will automatically load the credentials for the "guest" account. this makes sure that every request has a set of valid credentials to work with. it also allows me to control the access for guests. again, they might be able to GET, but not POST, etc.



Sessions

finally, i did some additional work on the session validation this am. now, i keep track of the session timeout (sliding value of 20 min for default). when the session has timed out, i refresh it (based on the existing 'old' cookie) and force a re-auth of the user. if this is a secured url, the user will be prompted again - nice. if it's a non-secure url, the guest account is refreshed quietly.
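
the caching trick for the sliding timeout can be sketched like this (the key name and data are illustrative):

// a sliding-expiration cache entry stands in for the session; when it
// evaporates after 20 idle minutes, the session is considered stale
HttpRuntime.Cache.Insert(
    "session:" + sessionId,          // assumed session key
    sessionData,                     // assumed session payload
    null,                            // no cache dependency
    System.Web.Caching.Cache.NoAbsoluteExpiration,
    TimeSpan.FromMinutes(20));       // sliding window resets on each touch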



Wrinkles

first, i am doing this all w/ basic-auth. not a big deal, i think, but i still need to implement digest-auth sometime soon. i'm hoping it will be relatively trivial to add branching code for that.



second, since each url will be auth'ed as either the prompted user login or the "guest", it's not clear how i'll be able to know what user is actually doing what action - esp. in un-auth'ed situations. while - security-wise - this is no big deal (i'll always have a valid security context for each request), it will make logging and some personalization a pita. i think i need to salt the session cookie with some other info that will make it easy to know who is here. once a user is prompted for a login, i should store some identifying data in a cookie for easy access (both server- and client-side). another job to do...



Caching

yeah, i did a lot of caching this am, too. i cache the list of user permissions, the list of secured urls, the session cookie (for timeout purposes), and i also cache each authurl that has been requested. this should keep the looping to a minimum at runtime.



well, that's a lot for this weekend - at least on the infrastructure side. i would still like to fill out some UI and general use cases to make things look like a real app/web page.



dealing with composing the html is the next big hurdle. i would like to use xml/xsl - even xinclude. but i need to look carefully at performance and other issues. as usual, a solid javascript-powered ajax page would always work well.





Friday, April 20, 2007

completed initial basic-auth implementation

i finally finished off the basic-auth implementation tonight. while i had the basic-auth request/response working, i still had to get the user/password store and validation working. now i do!



this first pass uses a simple xml storage pattern with user/pass along with a list of associated roles for that user. the details are loaded on the first validation of the user and kept in cache throughout the session. i can now add a permission check at the top of each web method to check the role of the current user. if the role check fails, i return a 403 - sweet!



next step is to move away from site-wide role-based model and go straight for a uri/http-method model. the user store should have the uri (actually a regexp that can resolve to one or more uri) and a list of allowed actions (get, post, put, delete, head, option, *=all, !=none). this can all be done within the security loop *before* ever getting to the http handler code that implements the method (get, post, etc). that way, the entire security details (authentication and authorization) are outside the handler entirely.



need to do some work to grok the details of building a list of regexps for uri and a way to cleanly load and walk these uris at runtime.



of course, once the uri/action pattern is solid, i can implement a version of digest-auth, too!



Monday, April 16, 2007

initial solution for the composer dilemma

i've arrived at a tentative decision on the whole 'composer' dilemma regarding authorization and caching on the web. basically, the point is this:



you cache data, not apps.



or, put another way:



decide if your web page is data or presentation. if it's presentation, then it's not cache-able.



while this is probably artificially rigid, for now, it keeps my head straight.



i am also starting to think about making all 'web pages' apps (or app-like thingies<g>). for example, a blog home page is really a tiny app that presents the most recent post, a list of the last ten posts, and a blog roll. those are three data items (fully cache-able via REST-like URIs). the home page of the blog is really just some html and a handful of javascript ajax calls to retrieve the appropriate data. see? the web page is an app, right?



btw - if the web page is really just some html and scripts to get data, then that web page can be safely cached, too, right?



hmm.....

added support for timeouts on ajax calls

i've been using a very nice xmlhttp 'starter' library from the ajax patterns site. while it's a good library, it doesn't have all the bells and whistles. one item missing is handling timeouts for long-running calls from client to server.



so, keeping it 'in the family' i decided to employ another pattern from the same site. now my library has a solid in-built timeout pattern as well as the ability to register a custom function to handle the time out event.



along the way i decided to adopt the rule of *not* using the .abort() method of the XMLHttpRequest object. while i see some strong debate on this matter, my testing has shown MSIE and FF behave differently when abort() is used. this, alone, along with tests that show invoking abort is not necessary, convinced me to drop it from the library.










Saturday, April 14, 2007

added support for basic auth today

i added the underlying support for basic-auth to the rest-based server system today. i also added session services. it's a simple cookie-based service, but it will make things easy to deal with. i plan on adding support for digest auth tomorrow. that would round things out nicely.



i still need to implement a user store and a role-based security model. while this will not be super simple, getting the auth right is the first step. i also need to establish a uri table. this can hold the auth and role details for any/all uris as well as details on transformations and caching. i have the basic ideas, but still need to come up with the details.



generally, it's going well, tho.  in fact, i am working on another side-project this weekend and find that my mind is now already thinking rest-like. this current project is typical RPC/API style and i'm having a bit of trouble keeping on track with the project's details.



i think that's a good sign, eh?






Friday, April 13, 2007

REST, HTTP, ACL

reviewing a couple of documents, including the W3C docs on their ACL implementation, points out that ACLs are applied to resource URIs - makes sense.



the W3C model has a database that lists URIs and their ACL grant. the work of matching the identity->role->grant details is done on the data side - nothing too magical, but would require a bit of 'busting' to get it right, i suspect (including granting for a single resource URI, for a resource URI collection [folder], inheritance downward, etc.).



i am still struggling with the issue of composers. not just the example of an xhtml home page, but also a simple list of resources. if a URI GET results in a list, is that list composed of only resources available to that identity/role? if that's true, does the same URI GET result in a different list for another identity/role?



[sigh] i think i'm missing something...

Thursday, April 12, 2007

groking REST

when designing a cache-able REST-ful implementation, two things are getting in my way:

- composers

- role-based security



i plan on tackling the composer issue first.



composers

composers are resources (usually xhtml pages) that act as mini-aggregators of underlying resources. for example, the home page of a blog site probably has the most recent two or three posts in full, a list of the last ten posts (in title/link form), a list of the site's blogroll and maybe other data. this resource is a composed document made up of other resources. how do i make sure that the home page is replayable, reliably cache-able, and easily updated as the underlying data changes?



server-side composers

there are two basic ways to handle the composing pattern. the first is to complete the composition at the server and then deliver the resulting cache-able document to the client. this works fine, but updating the content of this cached document now gets a bit tricky. to make the cache as up-to-date as possible, any change to the underlying resources (blog posts, lists of posts, blog roll, etc.) needs to invalidate the home page resource cache.



client-side composers

the second method is to complete the composition on the client - usually via ajax-type calls. the advantage here is that the composer itself is fully cache-able and will always honor the cache status of all the underlying resources. the downside is that it assumes much more 'smarts' on the client (will this model work on cell phones? other devices?).



it is certainly possible to create cache dependencies for the home page that will honor changes to the underlying resources. but now this creates a new set of management routines to handle not just the caching, but also the compose dependencies.  another approach is to simply make the composer resources cache for a fixed time (5 minutes, etc.). this simplifies the cache management, but risks clients getting 'stale' home page documents.



my first approach will be to use the client-side composer model. this allows me to focus on keeping the underlying resource implementation clean and cacheable for now. once i get the hang of that, i can focus on creating a workable server-side composer model that can be cached efficiently, too.



Sunday, January 7, 2007

welcome

new things will go here