Dr. Macro's XML Rants

NOTE TO TOOL OWNERS: In this blog I will occasionally make statements about products that you will take exception to. My intent is to always be factual and accurate. If I have made a statement that you consider to be incorrect or inaccurate, please bring it to my attention and, once I have verified my error, I will post the appropriate correction.

And before you get too exercised, please read the post, dated 9 Feb 2006, titled "All Tools Suck".

Sunday, February 20, 2011

Physical Improvement for Geeks: The Four Hour Body

I've just read through all of Tim Ferriss' The Four Hour Body (http://fourhourbody.com/) (4HB). Short version of review: found it really interesting and helpful and generally to be full of sound advice and guidance provided with a dose of humor. I am starting on the book's Slow Carb Diet (SCD) in an attempt to lose 20lbs of mostly visceral fat (read "lose my beer gut" and try to live to see my daughter graduate from college).

The book is written from a geek's perspective for geeks. It essentially takes an engineering approach to body tuning based on self experimentation, measurement, and application of sound scientific principles. In a post on the 4HB blog Tim captures the basic approach and purpose of the book:

"To reiterate: The entire goal of 4HB is to make you a self-sufficient self-experimenter within safe boundaries. Track yourself, follow the rules, and track the changes if you break or bend the rules. Simple as that. That’s what I did to arrive at my conclusions, and that’s what you will do — with a huge head start with the 4HB — to arrive at yours."

I've done Atkins in the past with some success so I know that for me a general low-carb approach will work. The Slow Carb Diet essentially takes Atkins and reduces it to the essential aspects that create change. The biggest difference between Atkins and the SCD is the SCD eliminates all dairy because of its contribution to insulin spiking despite a low glycemic index. So no cheese or sugar-free ice cream (which we got really good at making back in our Atkins days). The SCD also includes a weekly "cheat day" where you eat whatever crap you want, as much as you can choke down. After 6 days I've lost 3.5 lbs, which is about what I would expect at the start of a strict low-carb diet. I haven't had the same degree of mind alteration that I got from the Atkins induction process, which is nice, because that was always a pretty rough week for everybody.

What I found interesting about the 4HB was that Tim is simply presenting his findings and saying "this worked, this didn't, here's why we think this did or didn't work." He's not selling a system or pushing supplements or trying to sell videos. His constant point is "don't take my word for it, test it yourself. I might be spouting bullsh*t so test, test, test."

As an engineer that definitely resonated with me. He also spends a lot of time explaining why professional research is often useless, flawed, biased, or otherwise simply not helpful, if not downright counterproductive. As somebody who's always testing assumptions and asking for proof I liked that too.

He even has an appendix where he presents some data gathered from people who used the SCD, which, as presented, suggested some interesting findings and made the diet look remarkably effective. He then goes through the numbers and shows why they are deceptive and can't be trusted in a number of ways. If his intent was to sell the diet he would have just presented the numbers. Nice.

His focus is as much on the mental process as on the physical process: measure, evaluate, question, in short, think about what you're doing and why. Control variables as much as possible in your experiments.

I highly recommend the book for anyone who's thinking about trying to lose weight or improve their physical performance in whatever way they need to--Ferriss pretty much covers all bases, from simple weight and fat loss to gaining muscle, improving strength, etc.

He has two chapters focused on sexual improvements, one on female orgasm and one on raising testosterone levels, sperm count, and general libido in males. These could have come off as pretty salacious and "look at what a sex machine I've become" but I didn't read them that way. Rather his point was that improving the sexual aspects of one's life is important to becoming a more complete person--it's an important part of being human so why not enjoy it to its fullest? I personally went through a male fertility issue when my wife and I tried to start a family and if I'd had the chapter on improving male fertility at that time (and if my fertility had actually been relevant) it would have been a godsend. One easy takeaway from that chapter: if you want kids don't carry an active cell phone in your pocket.

An interesting chapter on sleep: how to get better sleep, how to need less sleep, etc. Some interesting and intriguing stuff there as well. Some simple actions that might make significant positive changes in sleep patterns, as well as a technique for getting by on very little sleep if you can maintain a freaky-hard nap schedule.

Overall I found the book thoughtful, clearly written, engaging and entertaining and generally helpful. I found very few things that made me go "yeah right" or "oh please" or any of the reactions I often have to self help books. He stresses being careful and responsible and having a clear understanding of what your goal is. In short, sound engineering practice applied to your physical self.

Dr. Macro says check it out.

Saturday, February 19, 2011

Chevy Volt Adventure: Feb Diagnostic Report

Just got the February vehicle diagnostic report email from the Volt. I'm not sure why I find it so cool that my car can send me email, but I do.

The salient numbers are:

35 kW-hr/100 miles

1 Gallon of gasoline used. [This is actually an overstatement as we have only used 0.2 gallons since returning from our Houston trip at the end of December.]

Our electricity usage for January (the latest numbers I have) was (numbers in parens are for Jan 2010):

Total kW-hr: 954 (749)
Grid kW-hr: 723 (455)
Solar kW-hr: 231 (294)
Dollars billed: $58.37 ($35.12)

$/kWh used: $0.06 ($58.37/954)

kWh/mile: 0.35 (35kWh/100miles)

$/mile: $0.02
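The per-mile figure follows directly from the numbers above. Here is a minimal Python sketch of the arithmetic, assuming that a blended rate (dollars billed divided by total kWh used, solar included) is a fair stand-in for what a kWh costs us. The function names are made up for illustration.

```python
# Figures from the January bill and the Volt's diagnostic report.
DOLLARS_BILLED = 58.37      # January bill
TOTAL_KWH_USED = 954        # grid plus solar
KWH_PER_100_MILES = 35      # from the diagnostic email


def blended_rate(dollars, kwh):
    """Dollars per kWh, averaged over everything the house used."""
    return dollars / kwh


def electric_cost_per_mile(rate, kwh_per_100_miles):
    """Dollars per mile at the given rate and consumption."""
    return rate * kwh_per_100_miles / 100


rate = blended_rate(DOLLARS_BILLED, TOTAL_KWH_USED)
per_mile = electric_cost_per_mile(rate, KWH_PER_100_MILES)
print(f"$/kWh:  {rate:.2f}")
print(f"$/mile: {per_mile:.2f}")
```

The blended rate understates what a grid-only kWh costs, since the solar share lowers the average, so treat the result as a floor rather than a precise number.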

Our bill for Dec was $32.00, so we spent an extra $26.00 on electricity in January, some of which can be attributed to the unusually cold winter we've been having. We also produced about 60kWh less this January than last.

But if we assume that most of the difference was the Volt, that means it cost us about $20.00 to drive the vehicle for the month. We used essentially no gasoline so the electricity cost was our total operating cost.

Looking at the numbers it also means that the draw from the car is less than or roughly equal to the solar we produced over the same period. Not that much of that solar went to actually charging the Volt since we tend to charge later in the day or overnight after having done stuff during the day, but if Austin Energy actually gave us market rates for our produced electricity rather than the steep discount they do give us, we could truthfully say we have a solar powered car, even in January. For contrast, our maximum solar production last year was 481 kWh in August, with numbers around 400 kWh most months.

Compare this cost with a gasoline vehicle getting 30 mpg around town at $3.00/gallon (current price here in Austin):

30 miles/gallon = 0.033 gallons/mile * $3.00/gallon =

$/mile: 0.10

However, our other car, a 2005 Toyota Solara, only gets about 22 mpg around town, which comes out to

$/mile: 0.14
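The gasoline side of the comparison is just price divided by mileage. A short sketch, using the $3.00/gallon price and the two mileage figures quoted above (the function name is hypothetical); dividing directly, with no intermediate rounding of gallons per mile, keeps the result good to the nearest cent:

```python
GAS_PRICE = 3.00  # dollars per gallon, the Austin price quoted above


def gas_cost_per_mile(mpg, price_per_gallon=GAS_PRICE):
    """Dollars per mile for a car getting `mpg` miles per gallon."""
    return price_per_gallon / mpg


for label, mpg in [("30 mpg car", 30), ("Solara at 22 mpg", 22)]:
    print(f"{label}: ${gas_cost_per_mile(mpg):.2f}/mile")
```

Either way, the gas cars come out at several times the Volt's roughly two cents per mile.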

Of course these numbers only reflect direct operating cost, not the cost of our PV system or the extra cost of the Volt itself relative to a comparable gas-powered vehicle, but that's not the point is it? Because it's not just lowered operating cost but being a zero-emissions vehicle most days and using (or potentially using) more sustainable sources of energy.

But another interesting implication here is what would happen (or will happen) when the majority of vehicles are electric? If our use is typical, it means about a 25% increase in electricity consumption just for transportation. What does that mean for the electricity infrastructure? Would we be able in the U.S. to add 25% more capacity in say 10 years without resorting to coal? How much of that increase can be met through conservation? It seems like it could be a serious challenge for the already-straining grid infrastructure, something we know we need to address simply to make wind practical (because of the current nature of the U.S. grid).
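One way to land on a figure in that neighborhood is the year-over-year change in our total January usage. That is an assumption about where the number comes from, and it overstates the car's share, since some of the difference was the cold weather:

```python
JAN_2010_KWH = 749  # total usage, January 2010
JAN_2011_KWH = 954  # total usage, January 2011, with the Volt


def percent_increase(before, after):
    """Percentage growth from `before` to `after`."""
    return (after - before) / before * 100


growth = percent_increase(JAN_2010_KWH, JAN_2011_KWH)
print(f"{growth:.0f}% more electricity than last January")
```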

If Chevy and the other EV manufacturers can bring the cost down, which they inevitably will, people are going to flock to these cars because they're fun to drive, cheaper to operate, and better for the air. Given the expected rate of advance in battery technology and the normal economies of scale, it seems reasonable to expect the cost of electric vehicles to be comparable to gasoline vehicles in about 5 years. If gas prices rise even $1.00/gallon in that time, which seems like a pretty safe bet (but then I would have expected gas to be at $5.00/gallon by now after its spike back in 2008), then the attractiveness of electric vehicles will be even greater.

Which is all to say that I fully expect EVs like the Volt to catch on in a big way in about 5 years, which I think could spell, if not disaster, then at least serious strain in the U.S. electricity infrastructure. I know the City of Austin is thinking about it because that's their motivation for paying for our charging station: monitor the draw from the car so they can plan appropriately. But are we doing that at a national level? I have no idea, but history does not instill confidence, let us say.

Wednesday, January 19, 2011

Chevy Volt Adventure: Fun to Drive

We've been driving the Volt around town now for a few weeks and the biggest surprise to me is how much fun it is to drive. The instant acceleration, freaky smoothness, and weight-enhanced handling all add up. You can zip around, corner hard, and do it all without fuss or noise. And we haven't even tried sport mode yet.

As for the car itself, it seems to be holding up well--I haven't noticed anything particularly tinny or annoying, with the possible exception of the charge port cover. The cover seems a little weak, but then it's just a little cover. Its latch is also a little less aggressive than I'd like--a couple of times I've thought I pushed it closed but it hadn't caught.

We are clearly not driving in the most efficient manner because our full-charge electric range is currently estimated at about 30 miles, which our Volt Assistant at GM assures us reflects our profligate driving style and not an issue with reduced battery capacity.

As a family car it's working fine. With our around-town driving we've only had to use a fraction of a gallon of gas when we've forgotten to plug in after a trip. So our lifetime gas usage total is about 8.6 gallons, of which 8.5 were used on the round trip to Houston.

Tuesday, January 04, 2011

Chevy Volt Adventure: Houston Trip 1

On Christmas Eve we loaded up the Volt and headed to Grandma's house in Houston.
[Photo: IMG_0948]
The picture shows the cargo area loaded for the trip. The cargo space is a little cramped but was able to accommodate what we needed for this trip, including all the gifts. It would be hard pressed to hold three full-sized rollaboards.

In the car we had me, my wife, our daughter, and our dog, Humphrey (a basset hound). Everyone was comfortable but this is definitely a 4-passenger vehicle because of the bucket seats in back. The seats were reasonably comfortable for a 3-hour trip, comparable to what I'm used to from our other car, a 2005 Toyota Solara convertible.

The total round trip from our house to Grandma's house is about 450 miles. The trip meter reports we used 8.1 gallons for a trip MPG of about 51, which is pretty good.

In our Solara, which averages about 22 MPG overall and gets probably 30 or so on the highway, we usually fill up at the halfway point out and back, using a full 15-gallon tank over the course of the trip. On this trip we didn't stop to fill up until the return, when the tank showed 3/4 empty. I put in about 6 gallons but I think the tank didn't fill (it was the first time I'd put gas in so I had no idea how much to expect to need—the tank must be 10 gallons if 3/4 reflected an 8-gallon deficit).

On the way out the battery lasted from Austin to just outside Bastrop, about 30 miles. It's clear that, as expected, highway speeds are less efficient than around-town speeds. I'd be interested to know what the efficiency curve is: is it more or less linear or, more likely, does energy use curve sharply up above, say, 50 MPH? My intuition says 40 MPH is the sweet spot. I tried to keep it between 60 and 70 for most of the trip (the posted limit for most of the trip is 70). I drove a little faster on the way home having realized that it didn't make much difference in efficiency.

Highway driving was fine. The car is heavy for its size, with the batteries distributed along the main axis, which makes it handle more like a big car than the compact it is. Highway 71 is pretty rough in places but the car was reasonably quiet at 70. When we left I-10 in Houston there was enough accumulated charge to use the battery for the couple of miles to my mother-in-law's house.

It definitely has power to spare and plenty of oomph. There's no hesitation when you stamp the accelerator and I had no problem going from 45 to 65 almost instantly to get from behind a slow car on I-10. We have yet to try the "sport" driving mode but now I'm almost afraid to.

The car is really smooth to drive--like driving an electric golf cart in the way it just smoothly takes off and doesn't make any noise.

If we had a problem it was the underbuilt electrical circuit at Grandma's that served the garage. At one point, while the car was plugged in and charging, the 15-amp circuit breaker tripped. It turned out the circuit also served most of the kitchen, where we were busy preparing Christmas dinner.

If there is any practical issue with the vehicle it's the climate control—it takes a lot of energy to heat it. Houston was having a cold snap so we got to test the heating system. The multi-position seat heaters are nice but keeping the controls on the "econ" setting meant that backseat passengers sometimes got a little chilled. You do realize how much waste heat gas engines produce when you don't have it available to turn your car into a sauna.

It was also weird to get back from a drive and realize that the hood is still cold.

We spent the last week traveling in the Northwest and rented the cheapest car Enterprise offers, which turned out to be a Nissan Versa, a tinny little econobox. The contrast was dramatic and made me appreciate the Volt. The two vehicles are comparable in size and capacity (but not cost, of course), but the Versa had a hard time making it up to highway speed, sounded like the engine might come out or explode under stress, and felt like it might blow off the road in a stiff breeze.

Now that we're back to our normal workaday life we'll see how it does in our normal around-town driving, but my expectation is that we'll use very little, if any, gas as we seldom need to go more than 10 miles from home (our longest usual trip is up north to Fry's, which is about a 20-mile round trip). We'll probably take it out to Llano and Lockhart for BBQ if we get a warm weekend in the next month or so.

On the way back from Houston we ended up near a Prius and ran into them at the gas station. They were interested in how the Volt was working and we got to compare MPG and generally be smug together. I ended up following them the rest of the way into Austin, figuring they probably reflected an appropriately efficient speed.

And I'm still getting a kick out of plugging it in whenever I bring it back home.

Tuesday, December 21, 2010

Chevy Volt Adventure

My family is now the second (in Texas or Austin, not 100% sure) to take delivery of a 2011 Chevy Volt. We got it last night and it's sitting in the carport happily charged.

The car is very cool, very high tech. It sends you status emails. It chides you for jackrabbit starts (although I gather other electric and hybrid vehicles do as well).

It is freaky quiet in electric mode, a bit rumbly in extended mode.

The interior is pretty nice, reasonably well laid out, nicely detailed. The back seat is reasonably comfortable (I have the torso of a 6-foot person and my head cleared the back window).

Accelerates snappily in normal driving mode (haven't had a chance to try the "sport" mode yet). Handles pretty nicely (the batteries are stored along the center length of the vehicle, giving it pretty good balance).

We'll be driving it to Houston, about 500 miles round trip, in a couple of days. I'll report our experience.

Early adopters get some perks. We get 5 years of free OnStar service. We get a free 240v charging station from the City of Austin at the cost of letting them monitor the energy usage of the charger. We get a special parking space at the new branch library near us. The Whole Foods flagship store has charging stations--might actually motivate me to shop there (we normally avoid that Whole Foods because it's really hard to park and you know, it's Whole Foods).

One thing that will take some getting used to is not having to put a key into it in order to operate it. I kept reflexively reaching toward the steering column to remove the key that wasn't there.

Here's a question for you Electrical Engineers out there: what is the equivalent to miles per gallon for an electric vehicle? Is it miles per megajoule? miles per amp-hour?

I'm trying to remember what the unit of potential electrical energy is and coming up blank (not sure I ever really knew).
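For what it's worth, the SI unit of energy is the joule, and the practical unit on a utility bill is the kilowatt-hour (1 kWh = 3.6 megajoules). Amp-hours measure charge rather than energy, so they only become energy once multiplied by voltage. The EPA's window-sticker answer is "MPGe", which treats 33.7 kWh as the energy equivalent of one gallon of gasoline. A small sketch of the conversions, where the 35 kWh/100 miles input is only a sample consumption figure and the function names are invented for illustration:

```python
KWH_PER_GALLON_EQUIV = 33.7  # EPA energy equivalence for a gallon of gasoline
MJ_PER_KWH = 3.6             # 1 kWh = 3.6 megajoules


def miles_per_kwh(kwh_per_100_miles):
    """Distance covered per kWh drawn."""
    return 100 / kwh_per_100_miles


def mpge(kwh_per_100_miles):
    """Miles per gallon-equivalent, using the EPA equivalence above."""
    return miles_per_kwh(kwh_per_100_miles) * KWH_PER_GALLON_EQUIV


def miles_per_megajoule(kwh_per_100_miles):
    """The same efficiency expressed per megajoule."""
    return miles_per_kwh(kwh_per_100_miles) / MJ_PER_KWH


sample = 35  # kWh per 100 miles, a sample figure
print(f"{miles_per_kwh(sample):.2f} miles/kWh")
print(f"{mpge(sample):.1f} MPGe")
print(f"{miles_per_megajoule(sample):.2f} miles/MJ")
```

Miles per kWh is probably the most useful of the three day to day, since it lines up directly with the electric bill.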

Oh, and since we have a PV system on the house and can control when charging takes place, I am going to claim that this Volt is a solar powered vehicle.

Wednesday, September 01, 2010

Norm Reconsiders DITA Specialization

Norm Walsh has published a very interesting post to his blog, Reconsidering specialization, part the first.

This is very significant and I eagerly await Norm's thoughts.

As Norm relates in his post, he and I had what I thought was a very productive discussion about specialization and what it could mean in a DocBook context. I think Norm characterized my position accurately, namely that the essential difference between DocBook and DITA is specialization and that makes DITA better.

Here by "better" I mean "better value for the type of applications to which DITA and DocBook are applied". It's a better value because:

1. Specialization enables blind interchange, which I think is very important, if not of utmost importance, even if that interchange is only with your future self.

2. Specialization lowers the cost of implementing new markup vocabularies (that is, custom markup for a specific use community) by roughly an order of magnitude.

There's more to it than that, of course, but those are the key bits.
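To make the blind-interchange point concrete: every DITA element carries a class attribute listing its ancestry from most general to most specialized type, for example "- topic/topic concept/concept ". A processor that has never seen a given specialization can fall back to the nearest ancestor it does understand. The Python sketch below illustrates that fallback only; it is not how any particular DITA processor is implemented, the function names are invented, and "faq/faq" is a hypothetical specialization.

```python
def parse_class(class_attr):
    """Split a DITA class attribute value into 'module/type' tokens,
    ordered from most general to most specialized."""
    tokens = class_attr.split()
    # The first token is "-" (structural) or "+" (domain); skip it.
    return tokens[1:]


def best_known_type(class_attr, known_types):
    """Return the most specialized type this processor understands,
    or None if it understands none of the element's ancestry."""
    for token in reversed(parse_class(class_attr)):
        if token in known_types:
            return token
    return None


# "faq/faq" is a hypothetical specialization of concept.
faq_class = "- topic/topic concept/concept faq/faq "

# A generic processor that only knows base topics still has something to do.
print(best_known_type(faq_class, {"topic/topic"}))
# A processor that also knows concepts gets a closer match.
print(best_known_type(faq_class, {"topic/topic", "concept/concept"}))
```

The sender never has to ship a stylesheet or schema for the content to be processable, which is what makes the interchange "blind".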

All the other aspects of DITA that people see as distinguishing: modularity, maps, conref, etc., could all be replicated in DocBook.

If we assume that DITA's more sophisticated features like maps and keyref and so forth are no more complicated than they need to be to meet requirements, then the best that DocBook could do is implement the exact equivalent of those features, which is fine. So to that degree, DocBook and DITA are (or could be) functionally equivalent in terms of specific markup features. (But note that any statement to the effect that "DITA's features are too complicated" reflects a lack of understanding of the requirements that DITA satisfies--I can assure you that there is no aspect of DITA that is not used and depended on by at least one significant user community. That is, any attempt, for example, to add a map-like facility to DocBook that does not reflect all the functional aspects of DITA maps will simply fail to satisfy the requirements of a significant set of potential users.)

But note that currently DocBook and DITA are *not* functionally equivalent: DocBook lacks a number of important features needed to support modularity and reuse. But I don't consider that important. What really matters is specialization.

Note also that I'm not necessarily suggesting that DocBook adopt the DITA specialization mechanism exactly as it's formulated in DITA. I'm suggesting that DocBook needs the functional equivalent of DITA's specialization facility.

Note also that DocBook as currently formulated at a content model level probably cannot be made to satisfy the constraints specialization requires in terms of consistency of structural patterns along a specialization hierarchy and probably lacks a number of content model options that you'd want to have in order to support reasonable specializations from a given base.

But those are design problems that could be fixed in a DocBook V6 or something if it was important or useful to do so.

Finally, note that in DITA 2.0 there is the expectation that the specialization facility will be reengineered from scratch. That would be the ideal opportunity to work jointly to develop a specialization mechanism that satisfied requirements beyond those specifically brought by DITA. In particular, any new mechanism needs to play well with namespaces, which the current DITA mechanism does not (but note that it was designed before namespaces were standardized).

Monday, August 09, 2010

Worse is Better, or Is It?

At the just-concluded Balisage conference, Michael Sperberg-McQueen brought up the (apparently) famous "worse is better" essay by Richard P. Gabriel (Wikipedia entry here, original paper here). I had never heard of this (or at least had no memory of ever hearing of it) even though it is directly relevant to my experiences as a standards developer and engineer, where I've done things in both the "MIT" way (correctness is most important) and, more or less, the "New Jersey" way (simplicity is most important). I was actually very surprised that nobody had ever pointed me to it before.

Gabriel's original argument is essentially that software that chooses simplicity over correctness and completeness has better survivability for a number of reasons, and cites as a prime example Unix and C, which spread precisely because they were simple (and thus easy to port) in spite of being neither complete functionally nor consistent in terms of their interfaces (user or programming). Gabriel then goes on, over the years, to argue against his own original assertion that worse is better and essentially falls into a state of oscillation between "yes it is" and "no it isn't" (see his history of his thought here).

The concept of "worse is better" certainly resonated with me because I have, for most of my career, fought against it at every turn, insisting on correctness and completeness as the primary concerns. This is in some part because of my work in standards, where correctness is of course important, and in part because I'm inherently an idealist by inclination, and in part because I grew up in IBM in the 80's when a company like IBM could still afford the time and cost of correctness over simplicity (or thought it could).

XML largely broke me of that. I was very humbled by XML and the general "80% is good enough" approach of the W3C and the Web in general. It took me a long time to get over my anger at the fact that they were right because I didn't want to live in that world, a world where <a href/> was the height of hyperlinking sophistication.

I got over it.

Around 1999 I started working as part of a pure Extreme Programming team implementing a content management system based on a simple but powerful abstract model (the SnapCM model I've posted about here in the past) and implemented using iterative, requirements-driven processes. We were very successful, in that we implemented exactly what we wanted to, in a timely fashion and with all the performance characteristics we needed, and without sacrificing any essential aspects of the design for the sake of simplicity of implementation or any other form of expediency.

That experience convinced me that agile methods, as typified by Extreme Programming, are very effective, if not the most effective engineering approach. But it also taught me the value of good abstract models, that they ensure consistency of purpose and implementation and allow you to have both simplicity of implementation and consistency of interface, that one need not be sacrificed for the other if you can do a bit of advanced planning (but not too much--that's another lesson of agile methods).

Thinking about "worse is better" and Gabriel's inability to decide conclusively whether it is actually better led me to this conclusion: Gabriel can't decide because both sides of his dichotomy are in fact wrong.

Extreme Programming says "start with the simplest thing that could possibly work" (italics mine). This is not the same as saying "simplicity trumps correctness", it just says "start simple". You then iterate until your tests pass. The tests reflect documented and verified user requirements.

The "worse is better" approach as defined by Gabriel is similar in that it also involves iteration but it largely ignores requirements. That is, in the New Jersey approach, "finished" is defined by the implementors with no obvious reference to any objective test of whether they are in fact finished.

At the same time, the MIT approach falls into the trap that agile methods are designed explicitly to avoid, namely overplanning and implementation of features that may never be used.

That is, it is easy, as an engineer or analyst who has thought deeply about a particular problem domain, to think of all the things that could be needed or useful and then design a system that will provide them, and then proceed to implement it. In this model, "done" is defined by "all aspects of the previously-specified design are implemented", again with no direct reference to actual validated requirements (except to the degree the designer asserts her authority that her analysis is correct). [The HyTime standard is an example of this approach to system design. I am proud of HyTime as an exercise in design that is mathematically complete and correct with respect to its problem domain. I am not proud of it as an example of survivable design. The fact that the existence of XML and the rise of the Web largely made HyTime irrelevant does not bother me particularly because I see now that it could never have survived. It was a dinosaur: well-adapted to its original environment, large and powerful and completely ill adapted to a rapidly changing environment. I learned and moved on. I am gratified only to the degree that no new hyperlinking standard, with the possible exception of DITA 1.2+, has come anywhere close to providing the needed level of standardization of hyperlinking that HyTime provided. It's a hard problem, one where the minimum level of simplicity needed to satisfy base requirements is still dauntingly challenging.]

Thus both the MIT and New Jersey approaches ultimately fail because they are not directly requirements driven in the way that agile methods are and must be.

Or put another way, the MIT approach reflects the failure of overplanning and the New Jersey approach reflects the failure of underplanning.

Agile methods, as typified by Extreme Programming, attempt to solve the problem by doing just the right amount of planning, and no more, and that planning is primarily a function of requirements gathering and validation in the support of iteration.

To that degree, agile engineering is much closer to the worse is better approach, in that it necessarily prefers simplicity over completeness and it tends, by its start-small-and-iterate approach, to produce smaller solutions faster than a planning-heavy approach will.

Because of the way projects tend to go, where budgets get exhausted or users get bogged down in just getting the usual stuff done or technology or the business changes in the meantime, it often happens that more sophisticated or future-looking requirements never get implemented because the project simply never gets that far. This has the effect of making agile projects look, after the fact, very much like worse-is-better projects simply because informed observers can see obvious features that haven't been implemented. Without knowing the project history you can't tell if the feature holes are there because the implementors refused to implement them on the grounds of preserving simplicity or because they simply fell off the bottom of the last iteration plan.

Whether an agile project ends with a greater degree of consistency in interface is entirely a function of engineering quality but it is at least the case that agile projects need not sacrifice consistency as long as the appropriate amount of planning was done, and in particular, a solid, universally-understood data or system model was defined as part of the initial implementation activity.

At the time Unix was implemented the practice of software and data modeling was still nascent at best and implementation was hard enough. Today we have deep established practice of software models, we have well-established design patterns, we have useful tools for capturing and publishing designs, so there is no excuse for not having one for any non-trivial project.

To that degree, I would hope that the "worse is better" engineering practice typified by Unix and C is a thing of the past. We now have enough counterexamples of good design with simplest-possible implementation and very consistent interfaces (Python, Groovy, Java, XSLT, and XQuery all come to mind, although I'm sure there are many many more).

But Michael's purpose in presenting worse-is-better was primarily as it relates to standards and I think the point is still well taken--standards have value only to the degree they are adopted, that is to the degree they survive in the Darwinian sense. Worse is better definitely tells us that simplicity is a powerful survival characteristic--we saw that with XML relative to SGML and with XSLT relative to DSSSL. Of course, it is not the only survival characteristic and is not sufficient, by itself, to ensure survival. But it's a very important one.

As somebody involved in the DITA standard development, I certainly take it to heart.

My thanks to Michael for helping me to think again about the value of simplicity.

Thursday, July 22, 2010

At Least I Can Walk To Work, Part 2

OK, I may have overreacted before. I was really pretty depressed there for a couple of days. I even started seriously considering buying a gun to have in the house "just in case". I think I've calmed down a bit. For one thing, I couldn't stay in that state without going literally mad.

As a counter to the depressing doomsaying of James Kunstler I found Peak Oil Debunked, which seems to be a reasonably thoughtful counter to the most extreme of the Peak Oil predictions. It cheered me up a bit, although I found a number of the arguments therein to be not entirely convincing or accurate-but-missing-the-point, especially as regards agriculture. Peak Oil Debunked (POD) isn't actually debunking the notion of peak, just trying to counter the most extreme doomsday prognosticating, which is good. POD isn't saying there is no peak, just that the results of it can't be as extreme as Kunstler and other Peak Oil doomsayers are predicting.

But it's hard to see how a serious contraction of resources, especially food (and by extension, capital for investment), isn't inevitable in the relatively short term. That can't play out well. Our current recession has to be a taste of what's to come--there just isn't going to be the level of energy input needed to pull us out the way WWII did for the Great Depression, so even if the contraction is slow it will still be a contraction and that will be hard on everyone who is even a little overextended or otherwise dependent on continued growth, which may be all of us, even those of us who have eliminated our debt and have a little cash set aside.

We're starting to see practical electric vehicles coming on line, but does that help if it keeps us from replacing the suburbs with more dense towns and villages? If we aren't building trains at the same time we're building wind farms, I fear we're missing the point. Why can't I take a 300kph train from Austin to Houston or Austin to Dallas?

On the other hand, it's going to be a long time before we can electrify air travel, barring some unexpected miracle in electricity storage density, and planes can't run on coal, so I think we're not far away from seeing air travel severely curtailed. The answer to my train question is "because I can fly Austin to Dallas on SWA for $100.00"--what motivation does any private enterprise have in building that train and what motivation does the State of Texas have, given it's millions in the hole just now? None. But I think that economic picture has to change soon (within the next decade).

But in many ways Kunstler's arguments are about financial chaos as a side effect of inevitable contraction and that seems much scarier and more likely than simply running out of fuel for cars. We're already in a situation where credit is hard to get. It won't matter how many electric cars GM or Toyota builds if nobody can get a loan to buy them.

So I don't know. It seems likely that market forces will tend to reduce consumption as prices increase, mitigating at least the immediate effects of fall-off in supplies. As the Peak Oil Debunked blog points out, there is a lot of room for conservation in the U.S. For myself, I could almost entirely eliminate the use of my car for things like getting groceries, the pharmacy, going out to eat, as long as I was willing to eat the time cost. We live within walking distance of all the essential services we need. But Austin, like most U.S. cities, doesn't provide the level of public transport needed to make more far-flung trips convenient, much less pleasant. I could shave a couple of kWhs a day from my electricity use if I really had to. But my house is already very energy efficient so it would be about using less A/C and putting all the wall warts on switches and that sort of thing, stuff for which we currently have no economic incentive in the face of the inconvenience and discomfort. I'd rather just invest in more solar panels as long as I have the cash to do so.

Let's say oil supplies tighten significantly over the next two years and bus ridership goes up--Austin's transit agency is already in serious budget trouble and would have a hard time reacting to a surge in ridership (as they did in 2008). While we finally have a (largely pointless) light rail system, it would take a remarkable effort to put in a more comprehensive trolley or Portland-style light rail system in less than 5 years given public demand for it.

[I say our light rail system is pointless because it essentially serves one suburb of Austin, making it convenient for people who work downtown and live in Cedar Park to get to work. The train doesn't go anywhere else interesting and doesn't usefully serve anyone south of downtown. Ridership has been a fraction of projections and of capacity. Small surprise. Now if there was a train that went from downtown to the airport and that came south to at least Ben White on Congress that would help a lot. I know of no plans along those lines.]

So is there anything actionable out of this new-found appreciation for peak oil and the inevitable contraction in our economy and life styles? Thanks to a recent inheritance I have some cash available. Should I buy a Volt? Expand my PV system? Buy gold? Put by a year's worth of rice and canned goods? Buy Treasury bonds? Fill my garage with machine tools? Buy a shotgun?

For now I think I'm going to take the following actions:

1. Put my name down on the list for the Chevy Volt. Austin is one of four launch cities.
2. Travel with my family as much as we can before air travel becomes a thing of the past for all but the richest humans.
3. Put in the rain water cistern we had to cut from our original house construction project (the rain capture plumbing is in place).
4. Think seriously about expanding the PV system, although I'm hesitant to do so too quickly as new PV technology is developing rapidly.
5. Avoid taking on any new debt--we are currently free of consumer debt and I'd like to keep it that way.
6. Continue to reduce our expectations, as a family, of what "enough" means, and try to teach my daughter that things are not what life is about.

I'm going to keep tracking opinion and bloviation in the Peak Oil space--it's entertaining if nothing else.

More as it develops....

Monday, July 19, 2010

Good Thing I Can Walk to Work

Given some of my colleagues in the XML community, I feel like I might be coming a little late to this party, but I just read James Howard Kunstler's The Long Emergency, a cogent and calm analysis of peak oil and an exploration of what that is likely to mean for the world as a whole and the U.S. in particular.

Kunstler's point is basically this: 100 years of cheap oil have allowed us to create a society of artificially inflated wealth, allowing us to overpopulate the Earth far beyond its normal carrying capacity (literally "eating oil" in the form of crops grown with artificial fertilizers and pesticides and irrigated using water pumped by cheap oil), live in unsustainable suburbs and skyscrapers, and generally live far beyond our means. And this period of cheap oil is about to start ending (and now, 6 years after the book was published, has started to), as we pass the "peak oil" point, the point after which the world supply of oil steadily decreases.

His prediction is that the decrease in supply will inevitably lead to a number of very serious problems, of which the most dire will simply be a lack of food--without cheap oil to irrigate, fertilize, and transport food people will starve. Lots of people. He makes the point that the world population at the start of the industrial revolution was about 1 billion people, which we can take to be the maximum solar carrying capacity of the earth. We're roughly 6 times that now. Not good. This can only lead to resource conflicts of the most serious kind.

Thus we are entering a time of contraction on all fronts: food supplies, energy for transportation, fuel for heating and industry, feedstocks for chemicals and plastics, etc. The reduction in real wealth will make it harder to maintain our current infrastructure and even build those things that might replace some of the lost oil (how do you raise the money to build wind farms or solar panel fabs when the growth-based economy has collapsed and the cost of everything is rising?).

In short, the 100-year period of constant growth in wealth is over, never to return. The disruptions will be severe and hard to predict.

All of this will be exacerbated by global warming (itself caused by coal and oil), which will further disrupt agriculture not to mention flooding coastal cities and, in the worst case, shutting down the gulf stream and freezing Europe.

Kunstler's arguments seem to be well grounded in fact and a clear historical context. Many of his facts were consistent with what I've read in other contexts. I didn't check his primary sources but he at least documented his key facts. He's not hysterical and does not appear to have any particular ideological axe to grind, other than clearly viewing all politicians as spineless and useless.

One part of the book that I found particularly striking was his remarkably accurate prediction of the financial meltdown that occurred in 2008.

He also makes the point that no alternative energy source is going to do much to help, certainly not in the short term (meaning the next few decades) for the simple reason that, even if we could generate all the electricity we needed with solar, wind, and nuclear, we can't replace our oil-based transportation system with an electricity-based one. Not to mention it's unlikely the U.S. would be willing or able to build enough nuclear plants fast enough. Hydrogen is patently nonsense and ethanol is a scam. If we start rebuilding the train network now, we might avoid the worst, but we're not seeing any political movement in that direction (although Warren Buffett's investment in Burlington Northern Santa Fe seems much more shrewd than it may have seemed at the time).

If he's anywhere close to right (and I think he is, or I wouldn't be mentioning it here), then we're in for some pretty serious disruptions in all aspects of society.

For information technology, one question becomes: where does the electricity come from to power the servers that we now depend on for our Google and Amazon and Bing? Not to mention the manufacturing facilities to build the computers themselves. Chip fabs cannot be scaled down to cottage industries.

I find it interesting that the trend in large-scale server farms has been to site them in the Pacific Northwest where hydro power is plentiful. That could be to our advantage. Will the economy remain sufficiently intact to even have a need for the sort of abstract computing infrastructure we build and maintain? Will soaring transportation costs make computing that much more important?

I just returned from a week-long trip to Oxford, UK, which I found to be precisely the sort of small city Kunstler says we'll all have to live in before too long: dense, walkable, with good public transport, not too vertical. I walked and took the bus to get from my hotel to my client about 2 miles away. I ate in neighborhood restaurants, all quite good, no more than a few minutes' walk or a short bus ride away. Several of my colleagues did not own cars and several biked to work every day. The cars on the streets were significantly smaller, on average, than I'm used to here in the States.

When I got home to Austin I was struck by the almost cartoonish hugeness of the vehicles in the airport parking garage--they seemed to be almost exclusively the hugest SUV models made. Even when we drive our 15-year-old Explorer, which was the largest SUV Ford made at the time we bought it, we often lose it behind bigger monsters parked around it. Our Toyota, which would have been a large car in Oxford, was really hidden.

How many more overseas trips can I expect to take in my job? I didn't do anything that couldn't have been done well enough remotely, although being there physically made things more efficient and had significant social benefit.

Another question that immediately came to my mind is what can I do to really prepare for the sort of almost unthinkable change that Kunstler says is coming? I've already done a good bit by moving closer to the city center, building an energy-efficient house that supplies some of its own electricity (and could supply more given an investment in more PV panels or a small wind generator). We raise chickens and grow a few vegetables and could grow a good bit more. There's land in the neighborhood on which a community garden could be based (could probably feed a good bit of the neighborhood with the open (and largely unused) areas on elementary school grounds just a couple of blocks away).

But what about natural gas? Kunstler points out something I hadn't known, which is that when natural gas wells tap out, they do so very quickly and without warning. And he claims most of our gas fields are already starting to play out. There's not much you can replace natural gas with. Could I build a digester and produce enough methane from chicken poop? Where would I get the grain to feed the chickens to make the poop? Right now we get organic grain from a mill here in Texas, but before that mill opened, we got it from a mill somewhere quite far away (Kansas? Pennsylvania? I don't remember now). Seems unlikely I could produce enough to keep my on-demand water heater running, much less run the stove or the furnace.

Here in Austin we're reasonably well situated--we have relatively warm winters, good solar exposure, water from lakes (not the Ogallala aquifer), decent agricultural land, and less sprawl than most other cities in Texas. But would that be enough? I don't know. Austin was the edge of the frontier when it was first settled in the 1830s. It could be again.

It would be hard for us to survive full summer heat without air conditioning, even with the passive solar aspects of our house, although our PV system was designed in part to cover the daytime load of the A/C system, so maybe we would be OK there. With an electric car we could get around as long as there was sun to charge it. And we can walk or bike to most of what we need. (Should I get on the list for a Volt? Austin will be a launch city. I'm seriously thinking about it since I do now have the cash to buy one.)

In any case, the book is a real eye-opener.

If there is any silver lining to all this it seems pretty clear that a lot of current issues like over-dependence on corn-based food, globalization's homogenizing and dislocating effects, over-sedentary, over-fed first-worlders, will go away pretty quick. But I can't see those changes being necessarily pleasant or welcomed by the majority. I'm glad I'm handy with tools and subscribe to Make. I'm thinking that maybe having a milling machine and a metal lathe in the garage might be good moves.

My father grew up on a pear and apple ranch in the Hood River valley of Oregon. We visited there recently and I got to talk to the woman who now runs the ranch (the granddaughter of the man who founded the ranch back in the early 20's, just as oil-based agriculture was coming into its own). We talked about globalization of agriculture and the local food movement and such.

She pointed out that their ranch, which is small by Hood River valley standards, produces as many pears as the state of Oregon consumes in a year. Most of the pears they grow are sold overseas. If they didn't have access to those markets, where would they sell? It's hard to see that ranch (or the Hood River Valley or the Yakima Valley) being viable in their current forms for much longer.

So enjoy those pears and $3.00 a pound Yakima cherries while you can. I know I am.

Sunday, April 04, 2010

My Precious: iPad Day 1

I bought an iPad at 9:30 am CDT 3 April 2010. I had to.

My nominal justification was to see if it would work for my dad. It absolutely will.

But really I just couldn't not have one.

Here are my initial impressions:

- typing is remarkably efficient. I am writing this on the iPad sitting in a comfy chair, pad in my lap (put it on a pillow after a while so cord would reach--battery finally running down). I am a fast touch typist and I can sort of touch type but really it's just fast hunt and peck. But I don't feel like it's slowing me down. I do miss arrow keys--finding navigating around in a multiline edit field a bit tedious.

- the response is very fast. Web pages load fast, apps load fast. Not like an iPhone at all.

- web browsing has full computer feel. Have not yet gone to a site that didn't seem to work in Safari (flash sites excepted of course)

- so far everything has just worked, which is a lot of the point of all apple products

- the only potential issue so far has been the volume indicator wouldn't go away watching some YouTube videos but it hasn't recurred

- everyone who sees it wants one. Badly.

- netflix app worked well. Punched up Willy Wonka and it played very nicely

- The Elements interactive book is a pretty amazing demonstration of what the device can mean for instruction and reference.

- I downloaded all the free newspaper apps I could find and they all provided a very satisfying reading experience. One of the things I was looking for was that Dave-Bowman-reading-the-paper-on-his-tablet-over-breakfast experience and I think we have it. Will definitely consider a NYT subscription--we get the Sunday Times and usually buy the Tuesday edition. So $4.00 a month for full access would be reasonable.

- iBooks seemed to work pretty well although I'd really like to be able to add my own epub books to it. Not sure if there's a way to do that in iTunes.

- upgraded to plants vs zombies HD and have been having a hard time dragging the device away from my daughter (age 6). She also likes the drawing apps.

- mail working pretty well, but not that different from iPhone experience except for more screen space and easier typing

- battery life seems as advertised: ran all day on a charge, including video, lots of PvZ playing, and weak wifi signals

It definitely meets the pick it up and carry it everywhere requirement, which raises several practical issues:

- will I ever be able to put it down? There's a serious danger of always having it to hand which means always reading something or playing a game or whatever.

- where will I set it down? We have concrete floors so you want to set it in a relatively safe place, of which we have few

- how do you keep it from being stolen?

So far I can say without reservations that it has exceeded my fairly high expectations after a day of use.

Cory Doctorow has made an eloquent and principled argument against the iPad as being a closed system that is counter to the basic concept of freedom and access the Internet represents. I agree with Cory in principle. I have spent my entire career championing standards specifically because they protect against proprietary control and lock in. Yet I have a MacBook and an iPhone and now an iPad and would not part with them. Why? Because they fricken work. They are solid and beautiful and reliable. Even though Cory is right it doesn't matter because there are very few of us who can trade reliability for openness.

If Google can build a software and hardware platform comparable to the iPad then I'm there. But so far not even Microsoft, much less the open source world, has succeeded in building a device (since WebTV) that I would put in my father's hands. Even my mother, who is quite computer savvy, has just traded in her Dell for a Mac.

At the same time, the content standards my clients depend on are all well supported: epub for ebooks, HTML for web delivery, PDF for page fidelity. Lack of flash is an annoyance but not a deal killer since nobody should be depending on flash exclusively anyway.

There is the question of whether the App Store as Apple manages it is draconian or a necessary evil in order to have a system safe for unsupervised use by children. I'm not sure I can form a useful opinion without a lot more thought. I'm not one for censoring children's access to information in general once they are old enough to understand what they might be finding, but 6 is not yet that age.


Saturday, January 09, 2010

Need a WebTV Replacement

Some years ago now I set my father up with WebTV. It met his needs perfectly: it gave him email and Web access from his TV (he spends most of his time in front of his TV), it was reliable, it didn't require him to learn how to use a computer generally, and it didn't require any support from me (my father is in Tacoma, Washington and I'm in Austin, Texas, so I can't just pop over to provide hands-on support).

My father is not tech savvy--he used a manual typewriter to produce a club newsletter for years until the club finally forced him to upgrade to an electric typewriter. He refuses to carry a mobile phone or use ATMs. You get the idea. However, he depends on email and eBay so he has to have some sort of Internet access.

Unfortunately, while Microsoft has not completely abandoned WebTV, they have not enhanced it in years and clearly have no intention of doing so--you can't even download the emulator they used to provide.

The problem for my father is that WebTV is simply no longer up to the task of supporting modern Web sites and it's becoming harder and harder for him to use eBay and other Web sites, like Amazon or Flickr. And forget about Facebook.

My quandary is what to replace WebTV with. So far I haven't been able to identify any obvious good solutions. The Wii's Web browser is close but it's still pretty clunky--even with a keyboard I don't think it would be reliable or simple enough for my dad--it requires a lot of Wiimote fiddling to scroll and pan around Web sites that don't fit nicely on a screen.

AppleTV would seem likely except that it doesn't come out of the box with a Web browser and I'm not going to support a hack that adds one.

A Mac mini might serve, but that gets us into the having a full computer problem, and I'm not sure my dad's TV takes HDMI input (I need to find out about that).

It seems like the new tablets that are all the buzz of the gadget world might serve, especially the rumored Apple tablet, but I'm not sure my dad would be willing to drop a grand on it, and I'm not keen to have him be an early adopter.

But I feel like I'm missing some obvious technology choice. Anyone out there have any thoughts about how to provide a TV-connected Web browser that is easy to use, works reliably, and will work with modern Web sites?

PDF2 Transform Now Enabled for Plugin-Based Extension

In the latest 1.5 Toolkit distributions, the PDF2 transform has been enabled for plugin-based extension. This means that you can use normal plugin techniques to provide extensions to the PDF processing that support specializations or global overrides, rather than customizations for specific publication sets or book designs.

As originally implemented, the PDF2 processor could only be extended through its unique Customization facility, whereby you either add things to its built-in Customization directory or create copies of that directory and then specify where the Customization directory is as a parameter to the transform. This is appropriate for customizations that are not global, that is, they are specific to particular publications, sets of publications, products, or whatever.

It is not appropriate, however, for providing general extensions, such as support for new domains where the domain-specific processing would normally be the same in all outputs or where the base processing is the same but can be customized using the normal PDF2 customization facilities.

In the latest DITA 1.5 Toolkit, you can now have both plugin-provided extensions and Customization-based extensions. This makes it easy to provide generic PDF2 support for specializations or provide global overrides for existing topic and map types.

A PDF2-extending plugin can provide only overrides, only a Customization directory, or both.

For example, for DITA for Publishers, I've started implementing support for the Publication Map (pubmap) map domain, which is similar to bookmap but tailored for Publishers. To support the PDF2 transform, I've created a plugin that provides both general extensions and a base Customization directory that can be used as a basis for local customizations.

The directory structure of the plugin is:

net.dita4publishers.pubmap.fo/
  Customization/
  xsl/
  plugin.xml

Where the Customization/ directory follows the rules and conventions for the PDF2 Customization directories and xsl/ holds the plugin-provided XSLTs that extend the base PDF2 processing.

The plugin.xml file looks like this:

<plugin id="net.sourceforge.dita4publishers.pubmap.fo">
  <require plugin="net.sourceforge.dita4publishers.formatting-d.fo"/>
  <require plugin="net.sourceforge.dita4publishers.pubContent-d.fo"/>
  <require plugin="net.sourceforge.dita4publishers.xml-d.fo"/>
  <feature extension="dita.xsl.xslfo"
           value="xsl/pubmap2xslfo.xsl" type="file"/>
</plugin>

The <require> elements indicate dependencies on other PDF2-extending plugins for the different domains that DITA For Publishers provides.

The <feature> element is what integrates the XSLTs into the main PDF2 XSLT transforms and it works just as for the HTML plugins, namely, the integrator.xml Ant task adds an xsl:include of the plugin-provided XSLT module into the main PDF2 transform shell XSLT.
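To make the roles of the two element types concrete, here is a minimal Python sketch that reads the plugin descriptor shown above and pulls out the declared dependencies and the XSLT module contributed to the dita.xsl.xslfo extension point. This is not the Toolkit's integrator: the function names and the href layout in illustrative_include() are assumptions made up for illustration, and the real integrator decides the actual include path.

```python
import xml.etree.ElementTree as ET

# The plugin descriptor from the post, reproduced as a string for this sketch.
PLUGIN_XML = """\
<plugin id="net.sourceforge.dita4publishers.pubmap.fo">
  <require plugin="net.sourceforge.dita4publishers.formatting-d.fo"/>
  <require plugin="net.sourceforge.dita4publishers.pubContent-d.fo"/>
  <require plugin="net.sourceforge.dita4publishers.xml-d.fo"/>
  <feature extension="dita.xsl.xslfo"
           value="xsl/pubmap2xslfo.xsl" type="file"/>
</plugin>
"""


def summarize_plugin(plugin_xml, extension_point="dita.xsl.xslfo"):
    """Return (plugin id, required plugin ids, XSLT files for extension_point)."""
    root = ET.fromstring(plugin_xml)
    # Each <require> names another plugin this one depends on.
    requires = [req.get("plugin") for req in root.findall("require")]
    # Each matching <feature> contributes one file to the extension point.
    xslt_files = [
        feat.get("value")
        for feat in root.findall("feature")
        if feat.get("extension") == extension_point and feat.get("type") == "file"
    ]
    return root.get("id"), requires, xslt_files


def illustrative_include(plugin_dir, xslt_file):
    """Build an xsl:include string. The href layout here is hypothetical."""
    return '<xsl:include href="{}/{}"/>'.format(plugin_dir, xslt_file)


if __name__ == "__main__":
    plugin_id, requires, xslt_files = summarize_plugin(PLUGIN_XML)
    print(plugin_id)
    for dependency in requires:
        print("requires:", dependency)
    for xslt_file in xslt_files:
        print(illustrative_include("plugins/" + plugin_id, xslt_file))
```

The point of the sketch is only that the descriptor is declarative: the plugin states what it depends on and which file it contributes, and the integration step turns that into an include in the shell XSLT.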

One thing that plugin-provided PDF2 transforms can do is define additional customization points: named attribute sets, named variables, and new XSLT modes, which can then be customized using the normal PDF2 customization mechanisms.

In the case of the pubmap extensions, I've extended the XSLT so that publication maps produce the same output as bookmaps (that is, a pubmap-d/chapter topicref goes through the same base processing as a bookmap/chapter topicref) and added support for DITA for Publishers-specific topic types, in particular, sidebar, which gets a box around it by default (XSL-FO 1.1 can't render multi-page floats, which would be the ideal way to render sidebars).

This enhancement to the PDF2 processor, along with the many other improvements made by the Suite Solutions team, makes it much easier to extend and customize the processor and, in particular, support new domains and topic types. The Customization process is as it was, but now you only need to use XSLT in your customization when you need truly customization-specific processing (for example, generating a publication-specific title page or copyright page).

Saturday, May 16, 2009

Why DITA Requires Topic IDs (And Why Their Values Don't Matter)

The DITA standard requires all topics to have an id= attribute.

Why?

The reason is simple: so you can point to elements within topics. And for
no other reason.


Surprised?

Most people seem to assume that topics are required to have IDs so you can
point to the topics. And they further seem to assume that topic IDs need to
be unique within some fairly wide scope (e.g., within their local topic
repository).

But that's not the case at all.

For the case of topics that are the root elements of their containing XML
documents and that contain no elements that themselves have IDs, the topic
ID isn't needed at all. In this case the topic can be unambiguously
addressed by the location of the containing XML document (e.g.,
"mytopic.xml").

In the case of topics that are not root elements and that are not themselves
pointed to and that do not contain any elements with IDs, again the ID is
not needed (because nothing points at the topic or its elements).

So why does the DITA standard require topics to have IDs?

It is because topics establish the addressing scope for their
direct-descendant non-topic elements.

By the DITA spec, to point to an element that is not a topic you use a
two-part pointer: {topicid}/{elementid}.

Without a topic ID it would be impossible to point to a non-topic element.
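A small Python sketch may make the two-part pointer easier to picture. It resolves {topicid}/{elementid} against a sample document in which the same element ID appears in two different topics. The sample document, the resolve() and find_in_topic() names, and the shortcut of recognizing topics by tag name are all assumptions for illustration (real DITA processors recognize topics through the class attribute, not the tag name).

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: a root topic with one nested topic. Both topics
# contain a <p> with the same element ID, which is fine because each
# topic is its own addressing scope.
TOPIC_DOC = """\
<topic id="topicid">
  <title>Parent</title>
  <body><p id="note1">Reusable paragraph</p></body>
  <topic id="st1">
    <title>Nested</title>
    <body><p id="note1">Same element ID, different topic</p></body>
  </topic>
</topic>
"""

# Simplification: treat these tag names as topics.
TOPIC_TAGS = {"topic", "concept", "task", "reference"}


def find_in_topic(topic, element_id):
    """Find a non-topic element by ID without descending into nested topics."""
    pending = [child for child in topic if child.tag not in TOPIC_TAGS]
    while pending:
        element = pending.pop(0)
        if element.get("id") == element_id:
            return element
        # Visit children next, in document order, skipping nested topics.
        pending = [c for c in element if c.tag not in TOPIC_TAGS] + pending
    return None


def resolve(root, fragment):
    """Resolve 'topicid' or 'topicid/elementid' within one XML document."""
    topic_id, _, element_id = fragment.partition("/")
    topic = next(
        (e for e in root.iter() if e.tag in TOPIC_TAGS and e.get("id") == topic_id),
        None,
    )
    if topic is None or not element_id:
        return topic
    return find_in_topic(topic, element_id)


if __name__ == "__main__":
    root = ET.fromstring(TOPIC_DOC)
    print(resolve(root, "topicid/note1").text)
    print(resolve(root, "st1/note1").text)
```

Note that the element ID alone ("note1") is ambiguous in this document; only the topic ID in front of it selects one paragraph, which is why the pointer cannot work without a topic ID.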

By requiring all topics to have some ID, it ensures that any non-topic
elements with IDs are immediately addressable without the need to also add
an ID to their containing topic.

In general by normal DITA practice, non-topic elements are given IDs only
when they are intended to be either used by conref or be the target of a
cross reference. Both of these tend to be carefully considered decisions
driven by editorial and business rules, not arbitrary author decision. Which
means you would tend to know, in advance of creation of a given element,
that it is a candidate for conref use or xref use, which means you know to
give it an ID at the time you create it.

By requiring that topics always have IDs it means that authors don't have to
worry about adding IDs to topics just because they also happened to put an
ID on an element. [In normal XML practice, elements are addressed directly
by ID within their containing document, which means it is sufficient to
simply put an ID on the element with no other dependencies. That is not the
case in DITA, which defines its own unique syntax for non-topic element
addressing.]

Because topic IDs are XML IDs (as opposed to non-topic-element IDs, which
are just name tokens and have no special XML-defined rules), any XML editor
will both require topics to have IDs and ensure that topic IDs are unique
within the scope of their containing document.

If topic IDs were not required, DITA-aware editors would have to have
special rules to know to require topic IDs whenever non-topic elements got
IDs and it would mean that generic XML editors would not ensure that this
important DITA rule was met (topics with elements with IDs must themselves
have IDs).

So the DITA spec requires that all topics have IDs.

But the fact that topics must have IDs does not imply that topic IDs need to
be either descriptive or unique within any scope wider than the XML
documents that contain them.

In the case where every topic is the root of its own document, the topic ID
can be the same *for every topic*. To make this point I have a standard
practice of using the value "topicid" for the IDs of all my root topics.
There is absolutely no need to generate unique topic IDs for document-root
topics as a matter of standard practice.

The only other case is ditabase documents.

If you are using ditabase documents, stop.

Sorry.

There are some legitimate uses of ditabase documents, for example, as a
first-pass target for data conversions and as a way to hold otherwise
unrelated topics that need to be managed as a single unit of storage, such
as topics that exist only to hold reusable elements.

[NOTE: Using ditabase simply to allow the mixing of different topic types in
a single document during authoring is the wrong thing to do. You should
have already created local shell DTDs and within those shells you can allow
whatever topic type mixing is appropriate for your local environment. There
is no need to use ditabase in that case and many reasons not to. See my many
other posts about why you should always create local shell DTDs as the first
step in setting up a production use of DITA.]

In that case, the topic IDs must be unique within the scope of the ditabase
element, simply because XML rules demand it. But the IDs need not be unique
beyond that scope and they need not be meaningful.

One of the implications of this is that if you always edit topics as
individual documents and never have nested topics you never have to think
about topic IDs. Your topic document template should already have an ID
value and it can be something like "topicid" and there is no reason
whatsoever for that ID to ever be changed.

In the case where you do edit topics with nested topics (for example, you're
authoring more or less narrative documents or you've designed some topic
types that need nested topics to allow a bit of hierarchy where the nested
topics would never be meaningful in isolation) then you either have to
configure your editor to assign IDs to the nested topics for you (if your
document template doesn't already have the subtopics with IDs assigned) or
you have to think about it. But even in that case, the IDs can be pretty
generic, e.g. "st1", "st2", etc. The IDs in that case still don't need to be
unique beyond the scope of the containing document.


Tuesday, May 06, 2008

Help Me Learn: How to Design a Solar Charging Circuit

I have a general interest in sustainable power systems (my home has a 5-star rating and a 3 kW PV system) but I am not an electrical engineer and have no useful understanding of electrical circuit design beyond very basic stuff (I know what a resistor is but I couldn't reliably tell you how resistance relates to current and voltage).

I have two projects in mind that I'd like to pursue, both of which require a bit more knowledge than I have and I have no idea where to go to get the knowledge--all of my resources are focused on small-scale electronics (digital circuits, basic oscillators, etc.).

My first project is to create a water feature that is variously powered by wind, solar, humans, etc. where all the different power sources contribute to charging a storage system which then drives an electrically-powered pump of some sort. I'm thinking of something like a 6-volt marine battery, something that can hold a good charge and produce enough current to drive a beefy motor.

What I don't know is how to design a charging circuit that will feed the battery from multiple input sources.

The other project I'm thinking about is modifying an RV to be electrically driven so that it could be, as much as possible, solar powered (e.g., for traveling about the American West during summer). That is, building an electric RV that would run off batteries for cruising and be recharged by a combination of solar, auxiliary generator (presumably a diesel engine that could run on waste vegetable oil or the most ecologically sound fuel available at the moment), or grid connection when parked.

It would need to enable a 200- to 300-mile range on a single charge to account for the (almost) worst case where you have no solar input and must recharge overnight from a campground. The worst case is no solar input and no grid access, so you'd have to run the generator in order to get to the nearest power source (or wait out the clouds without your beer getting too warm).

Some obvious questions are:

- Assuming an Airstream RV (chosen to minimize drag, even though they're frightfully expensive), how much energy would be required to provide a 200-mile range at 55 MPH?

- Assuming a more affordable typical RV, what would the cost from drag be?

- Given current solar panel technology, what output could be expected from the maximum area one could reasonably attach to an Airstream? Would it make sense to include some sort of fold-out panel system for use when parked (e.g., you're stopped for the afternoon at some tourist spot)?

- Assuming worst case of no solar input and no access to the grid, what size of generator would be needed to enable direct operation of the vehicle at say 40 MPH?

All of this would go to answering the first question, which is "is this even practical with today's generally-available and affordable technology?" If the answer to that is "no", then what advances would be required to make it affordable?

We could start with the presumption of a 50,000 USD budget, which is about what it costs to buy a full-sized conventional RV. So if I bought a used one and refit it, could that even be done for that budget?

Another consideration is the value of not buying fuel. With gasoline pushing 4.00 USD a gallon and diesel already over that as of May 2008, a 3000-mile trip at say 8 MPG starts to add up pretty fast. That's roughly 1500 USD in fuel costs for that trip. At 4 dollars a gallon, I can recoup 15,000 USD in investment in 10 years of driving vacations at one such trip a year. If fuel were at European rates the savings would of course be much higher (and it seems reasonable to expect that U.S. fuel will climb to approach European rates over the next 10 years simply because of both market pressures and increasing social acceptance of the true cost of our life styles in the face of global warming).
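The arithmetic above can be checked with a small sketch. The figures are the ones from the paragraph; the function names are illustrative, and the calculation assumes one 3000-mile trip a year and ignores whatever the electricity itself would cost.

```python
def trip_fuel_cost(miles: float, mpg: float, usd_per_gallon: float) -> float:
    """Fuel cost in USD for one trip."""
    gallons = miles / mpg
    return gallons * usd_per_gallon


def years_to_recoup(investment_usd: float, trips_per_year: int, cost_per_trip: float) -> float:
    """Years of avoided fuel purchases needed to cover an up-front investment."""
    return investment_usd / (trips_per_year * cost_per_trip)


# Figures from the paragraph above; one trip a year is an assumption.
cost = trip_fuel_cost(miles=3000, mpg=8, usd_per_gallon=4.00)
print(f"Fuel for one 3000-mile trip: {cost:.0f} USD")
print(f"Years to recoup 15,000 USD: {years_to_recoup(15_000, 1, cost):.0f}")
```

Changing `usd_per_gallon` shows how quickly the payback period shrinks as fuel prices rise.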

So I'm wondering if anyone can provide pointers to resources, online or otherwise, where I could start developing the necessary knowledge to start answering these questions?

I don't think any of this is particularly challenging from either a design or implementation aspect, I just have no idea how to go about learning about it efficiently....

Friday, April 18, 2008

Choosing an XML Schema: DocBook or DITA?

Richard Hamilton has presented a thoughtful analysis of when to choose DocBook or DITA, published on the Content Wrangler blog here: http://www.thecontentwrangler.com/article_comments/choosing_an_xml_schema_docbook_or_dita/

I started to post the following as a comment to that post but it got long enough that I thought it better to post my full response here.

I generally agree with Richard's analysis as far as it goes, but I think it misses several important points that I assert tip the scales significantly in favor of DITA over DocBook.

If you are looking for a documentation schema that you can just pick up and use and you don't need the modularity features of DITA (that is, you don't need the functionality of DITA maps) then DocBook probably makes the most sense for the reasons Richard cites, namely that there are more element types of likely utility out of the box and the processing infrastructure is more mature and better documented.

However, if you know you need to add markup for your specific requirements or are developing a new XML application where markup tailored for local users or requirements is important, or where modularity is important, then DITA has a very clear advantage because it is so much easier to develop and extend custom document types from a DITA base than from a DocBook base.

The reason is very simple: DITA's specialization mechanism, coupled with the declaration set design patterns defined by the DITA architecture, makes it as easy as it could possibly be to develop new markup structures. In particular, having defined specializations you may need to do nothing more in order to have documents that use those new types work with existing DITA processors, editors, CMS systems, etc.

DocBook cannot have this characteristic until such time as it either adopts the DITA specialization mechanism (which it could easily do--I worked hard to have the specialization aspects of DITA defined as distinct from the DITA element types specifically so that it could be adopted by other XML applications with a minimum of fuss) or adds the equivalent functionality using some other syntax [one limitation in the current DITA specialization mechanism is no good way to support namespaced elements--that will be fixed in DITA 2.0 but nobody has yet started to work in earnest on what that might be--this could be an opportunity for DocBook to take the lead since DocBook definitely has a namespace requirement.]

With any DocBook application, if you define new element types, there is no defined way to map those back to existing types and DocBook processors are not designed to handle new types by processing them in terms of some base type. That means that if you define new element types in a DocBook context you must update all processors that need to work with those documents even if all they need to do is nothing with those elements.
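The fallback that DITA-aware processors apply, and that DocBook processors have no equivalent for, can be sketched roughly as follows. This is a minimal illustration, not how any particular tool is implemented: the handler table and the "subsection" specialization are hypothetical, and the class attributes are written out in the instance because Python's ElementTree does not read the DTD defaults that would normally supply them.

```python
import xml.etree.ElementTree as ET

# Handlers a generic processor already knows, keyed by DITA "module/type" token.
# The handler descriptions are placeholders for real processing code.
BASE_HANDLERS = {
    "topic/topic": "render as generic topic",
    "topic/title": "render as title",
    "topic/p": "render as paragraph",
}


def class_tokens(class_value):
    """Split a DITA class attribute into module/type tokens, dropping the leading '-' or '+'."""
    return [token for token in class_value.split() if "/" in token]


def handler_for(element):
    """Return the handler for the most specialized type the processor knows, or None."""
    tokens = class_tokens(element.get("class", ""))
    for token in reversed(tokens):  # most specialized type is listed last
        if token in BASE_HANDLERS:
            return BASE_HANDLERS[token]
    return None


# A hypothetical specialized topic the processor has never seen before.
doc = ET.fromstring(
    '<subsection id="st1" class="- topic/topic subsection/subsection ">'
    '<title class="- topic/title ">Background</title>'
    "</subsection>"
)
print(handler_for(doc))     # falls back to the generic topic handler
print(handler_for(doc[0]))  # the title is already a base type
```

The processor never needs to know the name "subsection"; it only needs to find a base type it understands in the class attribute.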

On the subject of narrative documents, there is essentially no practical difference between DITA and DocBook in their ability to support the creation of single-instance documents of arbitrary depth. This is obvious for DocBook (because that's what it was designed for), not so obvious for DITA (because it was designed for the opposite).

But with DITA all you need to do is configure your local doctypes ("shells" in DITA parlance) to allow topics to nest. For example, the simplest case is to simply allow generic topic to nest. With that you can represent any possible narrative document structurally.

The only meaningful difference in this scenario between DITA and DocBook is that DITA requires the body of a section to be wrapped in a container (the topic body), while DocBook does not provide such a container (or at least it didn't last time I looked).

This is really a trivial difference.
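To make the nesting concrete, here is a small sketch of a narrative document built from nothing but nested generic topics, each with the topic body container mentioned above. The sample content and the `outline` helper are invented for illustration, and the document is parsed without a DTD, so it only shows the structure, not validation against a shell.

```python
import xml.etree.ElementTree as ET

# A chapter/section/subsection hierarchy expressed as nested generic topics.
NARRATIVE = """
<topic id="topicid">
  <title>Chapter</title>
  <body><p>Chapter introduction.</p></body>
  <topic id="st1">
    <title>Section</title>
    <body><p>Section text.</p></body>
    <topic id="st2">
      <title>Subsection</title>
      <body><p>Subsection text.</p></body>
    </topic>
  </topic>
</topic>
"""


def outline(topic, depth=0):
    """Yield (depth, title) for a topic and every topic nested inside it."""
    yield depth, topic.findtext("title")
    for child in topic.findall("topic"):
        yield from outline(child, depth + 1)


root = ET.fromstring(NARRATIVE)
for depth, title in outline(root):
    print("  " * depth + title)
```

Because the recursion has no depth limit, the same structure covers documents of arbitrary depth.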

For several clients who are doing publishing rather than technical documentation I have developed essentially trivial specializations that provide generic topics distinguished only by their topic type names but using otherwise generic DITA elements for content. I usually define a specialized topic called "subsection" that can nest to any depth. With that model you can represent documents as well as or better than you can with DocBook and you get all the other DITA goodness as well.

Finally, there is a free DITA-to-DocBook transform that is part of the free DITA Open Toolkit that allows you to use all the DocBook processing infrastructure with DITA-based content. This is used, for example, to use non-DITA-aware composition systems like XPP with DITA-based content.

Because DITA offers a number of very important features that DocBook does not, in particular specialization, modularity, and external links (relationship tables), and because DITA can be configured to work as well for non-modular documents as DocBook can, and because DITA makes the cost of developing new element types as low as it could possibly be, I've come to the conclusion that DITA is the best answer for any XML-based document-centric application I've seen.

Just the fact that you can get OxygenXML for almost nothing, define a completely new DITA specialization, deploy it to your local Toolkit as a plugin (a very easy operation once you know what to do, something I need to write a tutorial for), and then edit documents using that specialization in a full-featured graphical, tags-off editor with no additional work of any sort is pretty powerful. DocBook simply cannot enable that because it doesn't have DITA's specialization feature.

If DocBook adopted DITA's specialization mechanisms then this discussion wouldn't even be meaningful because DocBook would get all the value that specialization accrues to DITA and would still have the value of being a conceptually simpler model for documents.

Which raises the question: why doesn't DocBook simply adopt DITA's specialization mechanism? It would cost DocBook almost nothing to add and would add tremendous value. It would not require DocBook to change anything about its current markup design, except to possibly back-form some base types that are currently not explicit in DocBook but would be useful as a specialization base. But that would only make DocBook cleaner.

Sunday, February 17, 2008

XML is 10

The XML Recommendation is celebrating its 10-year anniversary, that is, the anniversary of the official publication of the Recommendation on 10 February 1998. However, I think of XML as really starting in 1996, when the activity was revealed publicly for the first time at the SGML 1996 conference. I wrote about XML and its development at that anniversary here: Dr. Macro's XML Rants: XML: Ten Year Aniversary (And I discovered this post, which I had totally forgotten about, when I googled "sgml 1996 conference" in order to verify that my memory of the dates was correct. How sad is that? [or conversely, how cool is that?--you choose.]).

I will re-iterate what I said two years ago: while Tim Bray and Michael Sperberg-McQueen, as the editors of the XML 1.0 Recommendation, are most publicly associated with XML, it was Jon Bosak who made XML happen. It was Jon who put the "SGML on the Web" working group together, personally invited all the initial members, set the working rules that allowed us to work quickly and productively, and managed the political and procedural process of getting XML through the W3C. Jon knew what he wanted and knew the ingredients that were needed and knew how to put them together in a way that would most likely produce the desired result. In that sense he was like a chef producing a dish dependent on the complex interactions of different ingredients, a dish that is not a simple assembly task but one that involves carefully managed reactions and cooking times applied to a variety of ingredients where quality was a key determining factor.

Without Jon's drive, judgment, and leadership, the XML development process could have easily bogged down or been derailed in any number of ways. It would have taken only one spoiler or resistance from inside the W3C or simple poor management of the process to kill or delay the whole thing.

It's also important to remember that what we developed as XML represents absolutely no technical innovation. There is nothing in the XML 1 Recommendation that isn't in SGML, with the possible exception of well-formedness being sufficient (since SGML required the use of DTDs with document instances). The genius of XML, and the challenge in developing the spec, was figuring out what to leave out of XML. Each of us on the Working Group had our pet features, without which we felt XML would be at best crippled, at worst useless. I think we did a remarkably good job of not including features that were not essential.

In retrospect, I wish we had gone farther and left out DTDs and entities entirely, but of course that would not have been politically acceptable at the time and there would have been nothing to replace DTDs with (in fact, I still find it amazing that the XSD spec was ever finished given the challenge inherent in developing that specification given the wide range of requirements and constituencies driving it).

I think it's also fair to say that XML has succeeded far beyond any of our initial expectations. All we really wanted was a way to publish SGML data using Web technology. It never occurred to us that it would be embraced as a general-purpose data structuring and program-to-program communication format (for good or ill). I've always found it a little annoying that the vast majority of data using XML has nothing or little to do with documents in the sense of information intended primarily for human consumption. Whatever.

I suppose prognostication is expected at this point.

Where do I see XML going in the next 10 years?

I think it's fair to say that XML is entrenched and unlikely to be replaced any time soon. It's hard to imagine that any group would have the motivation and resources to build a general-purpose XML alternative given XML works more than well enough for most of the applications to which it is put. From an engineering standpoint, it would be a case of overoptimization.

In the domain of structured documentation I think that the DITA standard in particular will accelerate the adoption of XML for document representation. The values have been well understood for decades and they aren't going to change. Because DITA, leveraging XML's deep and ubiquitous infrastructure, lowers the cost of entry of using XML for sophisticated document representation it can only serve to bring more enterprises and users to XML, users for whom in the past an SGML or even XML solution would have been prohibitively expensive. I find that very exciting. I don't remember well enough to know if that particular effect of XML was envisioned or even hoped for, but I think we all, even at that time, understood to some degree the power that Web technology had in general to make things easier and cheaper. But certainly lowering the cost of building XML parsers was a primary design driver, our mythical "graduate student with a weekend" to build a parser. That vision has definitely been realized.

In the domain of program-to-program communication it would not surprise me if something specifically designed for that task supplants XML, something like JSON. This is a domain where, because there is no particular great body of data, but only processing code, APIs, and support libraries, the engineering equation would make optimization more attractive: there's no question that XML is not the best solution for character-based serialization of arbitrary objects and data structures. I certainly wouldn't object to proposals to replace XML with JSON for those applications. The key is to understand that XML is still the best available solution for persistent data. I think a lot of people who use XML day to day forget (or never were told) that XML, via SGML, was originally designed to facilitate search and long-term, application-independent archiving of data. It is almost coincidence that makes that same application-independence useful for communication of transient data. Convenient but not optimal.

I fully expect to be able to do more or less what I'm doing now ten or twenty years from now. Whether I will be is another question, but so far, just when I thought I was completely bored of it, something new in the XML world has come along to re-energize my interest. And we're still struggling to build truly useful XML-aware hyperdocument management systems. Hopefully that won't be the case in 2018.

And let's not forget Dr. Charles Goldfarb, whose own single-minded passion, drive, and leadership produced SGML, without which XML (and HTML, for that matter) would never have happened. SGML turned 20 in 2006. It's largely now forgotten except by a few early adopters who have been using their SGML-based systems productively for ten or fifteen years now and had no compelling business reason to move to XML. But I remember.

Kids today....

Wednesday, January 23, 2008

FASB ASC U.S. GAAP DITA Application Is Live

[How many initialisms can I get in one post title?]

For the last year or so I've been working as part of a larger team at the Financial Accounting Standards Board (FASB), helping with the implementation of a DITA-based system to support authoring and delivery of the newly-codified U.S. Generally Accepted Accounting Principles (GAAP), the Accounting Standards Codification (ASC).

I contributed design of the DITA topic and map specializations used for the codified content and also implemented automated data conversion from an earlier XML format used in the initial codification editorial process.

The live Web site is here: http://asc.fasb.org/home

I've posted some details about the project on the Really Strategies blog: http://blog.reallysi.com/2008/01/live-dita-appli.html

Update: I should have mentioned (but wasn't 100% sure of the details) that the CMS system was built on the empolis e:CLS product and the Web delivery platform uses the empolis e:IAS search platform. The empolis Web site is www.empolis.com. I was not personally involved with that aspect of the system and was not involved with FASB's technology selection process (I was brought onto the project after they had selected their core technology). The system integration and development work was done by Ovitas, empolis' chief North American integrator.

Tuesday, January 01, 2008

I For One Welcome Our Cleaning Robot Overlords

As long as they keep the house clean.

My main gift this holiday season was a new Roomba 560 floor cleaning robot. This brings the cleaning robot population of Chez Kimber/Woods to 3, including our original Roomba 300 series and the Scooba.

I wanted the new Roomba because I found that, with a two-story house, it was just inconvenient enough to move the one Roomba between floors that I was less likely to go to the trouble to run it at all. We also found that the 300 series couldn't really deal with the area rug in our living room (we have concrete floors with one big rug in the living room) and that the noise of running it on the concrete floors was just a little too annoying. So in short, the robot was underused and the house tended to be not as clean as we would like (but couldn't actually be bothered to clean ourselves, not being what you would call obsessive house cleaners).

The 500 series promised to address all those problems with improved tolerance of things that would stop the 300 series (such as cords, furniture it tended to get trapped under, and the edge of the rug), reduced noise levels, and more effective capturing of pet hair (the 300 tended to just push around big clots of pet hair rather than sucking it up).

So far I have been very pleased with the 500 series--if anything it exceeded my expectations. It is significantly quieter, handles the rug just fine, doesn't get trapped where the 300 did (we have one big sideboard with these decorative bits at the base that the 300 would tend to get wedged under, the 500 never does) and seems to have a longer battery life.

So now the old 300 lives upstairs where it can focus on keeping our master bedroom clean and the 500 takes care of the downstairs.

As I said in my report on the original Roomba, these are amazingly well-engineered products that can serve as models and inspirations for all of us that build things for other people to use. Compared to the 300 the 500 is not significantly different but there are a number of minor but important refinements that add up to a much improved user experience, from the simplified controls (got rid of one button that wasn't of much use) to the better brushes to the easier-to-empty dirt chamber. And all at a reasonable price.

And it makes cleaning the house fun.

Loopwing Wind Generator

One of my best Christmas gifts this year was this model of a loopwing wind generator from Tamiya (http://www.tamiya.com/english/products/75021loopwing)

I was unaware of this particular wind generation technology but it seems quite intriguing in that it claims to be better able to extract energy from light winds and takes less vertical space (and presumably is less dangerous to birds) than straight-wing wind turbines. This means you could have one on your house in your back yard and maybe not put the entire neighborhood in danger or violate local noise ordinances.

The kit itself went together quite quickly, the hardest part being cutting out the wings themselves, which actually required a little skill and care rather than just screwing the parts together (there's no gluing or anything).

The turbine drives a generator that then charges a little model car that plugs onto the top of the generator body. The energy is collected in a super capacitor that can then run the car for about 3 minutes on a full charge.

The connector to the car appears to be a standard connector so it ought to be easy to build other things that can be charged. I was thinking a little LED display that indicates the level of output or something or maybe something decorative. It would certainly be easy to adapt it to charging solarengine BEAM robots.

The generator doesn't swivel to face the wind but it would be easy enough to mount it on a turntable with a wind vane if you really cared. I've got it mounted on a pipe that rises to about 5 feet and stands where the north side of our house forms a little wind tunnel that catches the northwest wind that tends to blow this time of year.

I find the prospect of having a home-sized loopwing generator interesting. We already have a 3K watt PV system on our house--it couldn't be that hard to add in the output from a small turbine, such as described here: http://www.treehugger.com/files/2006/11/loopwing_wind_t.php

Where we are in Central Texas we have a pretty reliable 5-10 MPH breeze most of the time and quite often stronger winds, especially in the spring and fall.

I think this year will start to see some interesting developments in alternative energy generation. Austin will be home to a new thin-film solar cell factory and is already home to a company trying to make high-capacity capacitors usable in electric vehicles. Taken together those technologies could make electric and solar-electric vehicles much more attractive in cost and range, not to mention the possibilities for home energy.

For example, imagine having a bank of capacitors that could provide the same power output as the little gas motors in all the three-wheel taxis in all of Asia and that can fit in the space currently used by the fuel tanks those vehicles carry (or otherwise fitted into available space).

Now imagine putting low-cost, flexible solar panels on the top of each of those three-wheelers (they all have some sort of canopy on them) as well as on taxi stand shelters scattered around a typical Asian city. If most of those three-wheelers spend most of their time waiting for a fare, it seems reasonable to think that they could be mostly or entirely charged by their solar panels, taking from the main grid or a taxi-stand battery or capacitor bank only during peak times (e.g., morning and evening rush hour). Or maybe they could use one of those small fuel cells the Japanese are selling for home power use for peak-time charging where the grid is not reliable (or where natural gas is inexpensive).

The effect of such a change would be dramatic: a significant source of air pollution would be eliminated, the need for fossil fuel would be significantly reduced in a part of the world where oil demand is rising much too sharply, and the operating cost of the taxis themselves would be reduced (assuming both that electricity costs per kilometer would be lower than fuel costs and that much of the operating energy would be from the vehicles' own solar panels).

With current battery technology, batteries could never be used to realize this vision: they're too expensive and too toxic and have too little energy capacity. But capacitors, if the current claims of orders of magnitude improved capacity prove out, could, because they have both a much higher energy density and lower toxicity (at least I assume they do) and they can charge very quickly, meaning that a taxi could do a 15 or 20 minute fare and then recharge in minutes at a recharging station or charge over say an hour using its own solar cells. That means a three-wheel taxi doesn't need to carry as much on-board energy capacity as it would for a battery solution.

Assuming the technology were there, what would it cost to provide a retrofit kit to every tuk-tuk operator in, for example, the Philippines? It would be several hundred million dollars at least (e.g., say $500.00 per vehicle) and as difficult to administer fairly and efficiently as any other aid project, but I would think that there would be lots of incentive from many parties to make something like that happen. And once the local population got used to the technology and had access to spares and second-hand parts and so forth, the technology would be applied in many other creative ways. And at some point you'd hope it would be good enough to, for example, allow Philippine jeepneys to be retrofitted for electric power.

And cities like Manila and Colombo and New Delhi would be much much quieter, with all those two-cycle motors replaced with electric drives.

Of course, the possibilities for other transforming uses of low-cost, physically flexible (that is, bendable) solar panels in developing and third-world countries are quite exciting. It will be interesting to see how the technology develops in terms of its economics and manufacturing environmental costs.

While there's no obvious direct connection between XML and alternative energy, we, as an industry and as a society of large-scale computer system users, are starting to realize that the collective cost of computing equipment does represent a significant fraction of our total societal energy draw. So the more a technology like XML enables more people to do more with computer systems, the greater the power such use will draw.

As I write this, I'm sitting in a room with three computers running, drawing a couple hundred watts, as well as using Google and Yahoo, backed by massive data centers drawing megawatts of largely coal-produced electricity (except for those data centers built in Central Washington to take advantage of the cheap hydropower provided by salmon-habitat-destroying dams on the Columbia River and its tributaries). I'd feel a little better about that if I could at least make my urban house electricity self sufficient without spending too much more than I already have on alternative energy systems that make little economic sense under current U.S., Texas, and Austin energy policy (in particular, that, unlike in Europe, utilities can buy back excess power at a steep discount from market rates, making the payback on my solar PV system 15 years or more *after* having half the initial cost rebated by the city and federal tax credits). Obviously we did it because we felt it was the right thing to do and we could afford it, not because we had any financial incentive to do so.

Anyway, that's a long way from a cool toy that I got for Christmas....

Friday, October 12, 2007

I'm Bein' Macified

Through a series of more or less accidents I came to have physical possession of Really Strategies' one and only MacBook, purchased in order to support testing and delivery of software to a Mac-based client (which, considering that most of our clients are publishers should be most of them, but apparently hasn't been to date).

After some soul searching I have decided to make this Mac my primary development machine, giving up my oh-so-familiar Dell Windows-XP-based laptop.

We'll see how it goes. I must say that it's been quite an adjustment for me, somebody with nearly 20 years of Windows brain damage, to move to a Mac.

Of course it helps that most of the development tools I use are completely cross platform: Eclipse, Java, OxygenXML, Syntext Serna. It also helps that OS X is an *nx-based system under the covers, so I can get a command line that is familiar, although the configuration details are not (I've been using Debian-based distributions for most of the time I've used Linux). And other key tools have solid Mac versions (e.g., all the Adobe products).

I will even be able to get an RSuite server running on this machine, using an unsupported OS X build of MarkLogic.

I'm even starting to get used to the bizarre control key mechanism, although it's still a struggle--it feels like trying to learn a new musical instrument that is just enough different from one you know to really hose you up.

I'm even writing this post using Safari, rather than Firefox, which I would normally use, but it's acting up this morning.

So wish me luck as I start on this new adventure in computing....

Monday, October 01, 2007

Automatic Handling of DITA Docs In XML Editors

I'm in demo prep heck at the moment, trying to get some real DITA functionality built on top of Really Strategies' RSuite CMS product. One of the key challenges here is integrating XML editors to handle this use case:

Initial state: You are presented with some valid, conforming DITA documents in some locally configured and/or specialized document type, organized by one or more maps. You (and your repository and supporting tools) have never seen this particular set of documents or their DTDs before.

Step 1. Import map and all dependencies (including its DTD) into the repository

Step 2. Within the repository, find a topic to edit and push the "Edit with {name of integrated editor}" button in the repository UI.

Step 3. Editor opens with document, with all DITA support features applied.

It is that step 3 that is currently causing me a bit of pain. And it shouldn't.

The reason it's causing me pain is because every graphical XML editor has been built on the presumption that document types are relatively static and that some XML specialist will develop lots of doctype-specific setup and then deploy that setup once, followed by a long time with no changes to that setup.

Thus, if you're presented with new documents in a heretofore unseen DTD, they're not going to work in the editor until you go through the setup and configuration process for the new document types [And remember that DITA 1.1 requires at least six distinct shell types: map, concept, reference, task, glossentry, and dita, plus any additional specialized map or topic types you might have--that's a lot of DTD-specific configurations to set up, even if most of that effort is just copy and paste, it's still tedious and prone to the usual errors of catalog misconfiguration, filename misspelling, and so on.]

However, DITA totally chucks the assumption of static, well-known doctypes out the window. DITA says "hey, every shell is different, specialize away, apply agile approaches to developing and refining your local DITA-based DTDs, combine topics from everywhere willy-nilly, go nuts, have fun".

To support this DITA does something very important: it enables reliable auto-recognition of DITA documents, regardless of the details of the local configuration or the use of specialization.

DITA must have this mechanism because the specialization feature allows generic DITA processing to be reliably applied to any conforming DITA document. Because it can be, it should be.

For the DITA Open Toolkit this means applying default processing (transforms, filtering, etc.).

For editors it means applying default editing style sheets, enabling DITA-specific user interface components (e.g., "Insert topicref"), etc., if no more specific configuration already exists for the document or its shell doctype.

And there's no reason for any DITA-aware editor not to, except that, without exception that I can find, they've all implemented their document-to-functionality mapping in a way that doesn't enable this sort of dynamic association. The closest I've found so far is Syntext's Serna editor, which, while it doesn't recognize specialized topics as DITA topics and automatically apply its (very nice) built-in DITA support, does make it a two-click process to apply that support manually. So kudos to Syntext. But it should be a zero-click process.

For this automatic process to work, processors have to be able to examine any document they're presented with and reliably determine whether or not the document is DITA-based. Note that the Open Toolkit presumes that what it's given is DITA-based because that's the only thing it is designed to process. But things like editors and CMSs are, for the most part, completely generic and designed to handle any XML at all. So they cannot presume (or at least they should not presume).

The recognition of DITA documents cannot be based on the use of any particular DTD's system or public ID, because they'll all be different. You can't look for a particular well-known element type because the element types could be completely different from anything previously seen (let's imagine a specialization where all the element type names are in Chinese--there's nothing that prevents it and if I were a native reader of Chinese and wanted to create tech docs I'd probably do just that).

That means you've got to go by something invariant that is reliably in every document. In XML that really means the use of a particular well-known namespace. However, DITA element types cannot be in namespaces because the current DITA class mechanism syntax cannot support namespace-qualified names. Knowing that about DITA you might think "well what to do then?"

However, just because elements can't be in a namespace, it doesn't mean attributes can't be. And that's the trick DITA uses in DITA 1.1 to enable autorecognition of DITA documents, regardless of any other aspects of the DTD (its public or system IDs, the element type names used, etc.).

This trick is the DITAArchVersion attribute. This attribute is in the namespace "http://dita.oasis-open.org/architecture/2005/". Any document that includes this namespace is almost certainly a DITA document, especially if the namespace qualifies an attribute named "DITAArchVersion" and the element on which that attribute occurs has a class= attribute conforming to the DITA class attribute syntax.
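To make that concrete, here is a minimal Python sketch of the per-element check, using only the standard library. The function name and the regular expression are my own illustration, not anything defined by the DITA spec, and the pattern is only an approximation of the class attribute syntax ("-" or "+" followed by blank-separated module/type pairs). It also assumes the attributes are actually present on the parsed element: DITA DTDs typically supply them as defaults, and a parser that doesn't read the DTD (ElementTree doesn't) won't see defaulted values.

```python
import re
import xml.etree.ElementTree as ET

DITA_ARCH_NS = "http://dita.oasis-open.org/architecture/2005/"
# ElementTree spells namespaced attribute names as "{namespace}localname".
ARCH_ATTR = "{" + DITA_ARCH_NS + "}DITAArchVersion"

# Approximation of the class syntax: "-" or "+", then one or more
# blank-separated module/type pairs, e.g. "- topic/topic faq/faq ".
CLASS_SYNTAX = re.compile(r"^[-+](\s+[^\s/]+/[^\s/]+)+\s*$")


def looks_like_dita_element(elem):
    """True if elem has the namespaced DITAArchVersion attribute
    and a class attribute that matches the DITA class syntax."""
    if ARCH_ATTR not in elem.attrib:
        return False
    return bool(CLASS_SYNTAX.match(elem.get("class", "")))
```

Note that the element type name never enters into it, which is the whole point: a specialized "faq" topic, or one with Chinese element names, passes the same check as a plain topic.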

This means that regardless of the actual DTD or schema a DITA document uses, it can be recognized as being a DITA document. That means that you can then reliably and usefully apply default DITA processing to the document without having specifically configured its particular DTD or schema as being a DITA schema.

That is, the behavior I expect from any editor that claims to be DITA-aware is that if I open any conforming DITA document, regardless of what declaration set it happens to use, I should get all the default DITA-specific stuff automatically.

While the most robust implementation of this behavior would make all the checks described above, it is probably sufficient to assume that if a document's root element has a DITAArchVersion attribute or if the root element is named "dita" and any of its children have a DITAArchVersion attribute, then the document is a DITA document.
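The "probably sufficient" version of the check is short enough to sketch in a few lines of Python. This is an illustration under assumptions, not how any particular editor does it: the function name is made up, and it assumes the DITAArchVersion attribute is present on the parsed elements (if your documents rely on DTD defaults for it, you need a parser that applies them, which ElementTree does not).

```python
import xml.etree.ElementTree as ET

DITA_ARCH_NS = "http://dita.oasis-open.org/architecture/2005/"
# ElementTree spells namespaced attribute names as "{namespace}localname".
ARCH_ATTR = "{" + DITA_ARCH_NS + "}DITAArchVersion"


def is_dita_document(root):
    """Simplified recognition: the root has DITAArchVersion, or the root
    is named "dita" and at least one direct child has DITAArchVersion."""
    if ARCH_ATTR in root.attrib:
        return True
    if root.tag == "dita":
        return any(ARCH_ATTR in child.attrib for child in root)
    return False
```

An editor-like tool could call `is_dita_document(ET.parse(path).getroot())` when a document is opened and fall back to its default DITA configuration whenever the answer is yes and no more specific configuration applies.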

The DITA spec only really recognizes three possible configurations of elements in a conforming DITA document: root of base type "map", root of base type "topic", or root of type "dita" [the dita element is not specializable in DITA 1.1] where its direct child elements are of base type "topic"--anything else is not a conforming DITA document (although it may contain individually conforming topics or maps) and you have no obligation to apply DITA-specific features to it (although you could if you wanted to).
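Those three configurations can be told apart from the class attribute alone, since the first module/type pair in a class value names the base type. Here's a rough Python sketch; the function names and return values are hypothetical, and it again assumes class attributes are present on the parsed elements rather than left to DTD defaults.

```python
import xml.etree.ElementTree as ET


def base_type(elem):
    """Return the first module/type pair of the class attribute, or None."""
    tokens = elem.get("class", "").split()
    # tokens[0] is the "-" or "+" prefix; tokens[1] is the base type.
    if len(tokens) >= 2 and tokens[0] in ("-", "+"):
        return tokens[1]
    return None


def dita_configuration(root):
    """Return "map", "topic", or "dita" for the three recognized
    configurations, or None for anything else."""
    base = base_type(root)
    if base == "map/map":
        return "map"
    if base == "topic/topic":
        return "topic"
    if root.tag == "dita":
        children = list(root)
        # Every direct child must be of base type "topic".
        if children and all(base_type(c) == "topic/topic" for c in children):
            return "dita"
    return None
```

So a specialized map with class="- map/map bookmap/bookmap " comes back as "map" no matter what its element is called, and a dita element wrapping anything other than topics comes back as None.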

That's by way of saying it's probably good enough to just look for the DITA namespace anywhere in the document and go by that, but it could lead to false positives in cases where the document is not strictly a conforming DITA document.

And it would be really cool if editors provided defined extension points by which this type of recognition could be added to doctypes as plug-ins to the editor.
