How Square Roots are Calculated in Quake

This has been sitting in my drafts folder, waiting for me to read the article, learn about it and summarize it here.

I took a quick scan today to make sure I wasn’t biting off more than I could chew when I stuck it in the queue. Unfortunately, I don’t think I understand it any better than the writer I linked to, and I’m not interested in spending the time to become so.

Here’s the attention-grabbing part:

My Understanding: This incredible hack estimates the inverse root using Newton’s method of approximation, and starts with a great initial guess.

The trick has to do with how floating point numbers are stored in a computer, something I’ve actually blogged about.
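For my own notes, here’s the gist of the hack as a Python sketch. The original is C and operates on raw 32-bit floats; the `struct` calls below fake the same bit-reinterpretation in double-precision Python, so treat this as an illustration rather than the real thing:

```python
import struct

def fast_inv_sqrt(x):
    """Approximate 1/sqrt(x) the Quake III way (Python sketch;
    the original is C working directly on 32-bit floats)."""
    # Reinterpret the float's 32 bits as an integer.
    i = struct.unpack('<I', struct.pack('<f', x))[0]
    # The magic constant exploits how the exponent and mantissa are
    # laid out, yielding a surprisingly good first guess at 1/sqrt(x).
    i = 0x5f3759df - (i >> 1)
    y = struct.unpack('<f', struct.pack('<I', i))[0]
    # One step of Newton's method polishes the guess.
    y = y * (1.5 - 0.5 * x * y * y)
    return y
```

One Newton step already lands within a fraction of a percent of the true value, which was plenty accurate for 1990s game lighting.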

Who said math wasn’t useful!

I Agree

Eventually, maybe everyone will truly have to be able to code to effectively do any office job.

More here.

I love reading about people picking up coding later in life. I consider myself a member of that group. Learning coding when I should rightly be doing other stuff.

I Have An Origins Fetish

When I was a kid, I read a lot of fantasy novels. Super dorky to be sure, but there was one particular aspect of these fantasy worlds that really captivated me: the backstories.

I loved reading backstories. I loved reading about context. I read Lord of the Rings when I was a kid and was transfixed by the rich history hinted at throughout. And when I found out about The Silmarillion, I devoured it. Twice. I’ve reread all of these books, but I’ve reread the Silmarillion about two or three times for every time I’ve reread LOTR.

I used to read about mythology and, even though I’m not at all religious, I love reading the Old Testament of the Bible. When I find myself in a Christian church, I immediately look for a Bible. By the way, you’d be surprised how FEW churches actually have Bibles in them.

Actual, real history is cool, but only the grandiose big-picture parts. As soon as the discussion wanders into pottery and art and cultural minutiae, I quickly fall asleep. Not interested. I only care about things that affect entire societies and shape generations of lives. My favorite history books so far have been Europe Between The Oceans and GG&S. Big big big.

So now I’m learning programming and I’m feeling the itch: what’s going on under the hood? Who made the decisions for how things work and why? When? Python is where it started, but I’m drawn towards the deeper, darker and older aspects of computing. I’m fascinated yet terrified by C, Assembly and the (currently, for me) mystical universe of hardware engineering*.

This big little slideshow was interesting in that it’s JUST accessible enough for me to get through, and it candy-coats the scariness of C. It’s scary because (I think) you’re speaking to the computer almost directly, and computers don’t all take commands in the same way. I would never want to start learning programming by learning C, so I’ll leave the expertise to the fat neckbeards.

But I can’t help myself from learning the programming backstory.

*One day I will get this and use the excuse to teach myself how it all really works.

Irritated Rant – Stanford DB-Class Course Notes

Got an old-fashioned ass-whooping this morning on my XML quiz. I despise this kind of test, though, and have always been terrible at them.

Let’s take it from the top: XML is a standard for machines to read data and so is an excellent example of something humans are crap at. To write valid XML in one swoop on a test, for example, you need to memorize a variety of rules.

Such as: when a schema declares subelements inside an <xs:sequence> tag, the actual elements need to appear IN ORDER.

Or this one: “a valid document needs to have unique values across ID attributes. An IDREF attribute can refer to any existing ID attribute value.”

Who the #$%# cares? When you’re actually implementing XML, you are probably using some kind of developing environment that either makes these kinds of errors difficult to make or very easy to identify and fix quickly. Why are we teaching people to do something that COMPUTERS ARE A BAJILLION TIMES BETTER AT?!?
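To make the point, here’s a toy sketch of how trivially a machine checks the ID/IDREF rule. It assumes the attributes are literally named `id` and `idref` (my simplification — a real validator learns which attributes are IDs from a DTD or schema):

```python
import xml.etree.ElementTree as ET

def check_ids(xml_text):
    """Flag duplicate ID values and dangling IDREFs.

    Toy version: assumes the attributes are literally named 'id' and
    'idref'; a real validator gets the attribute types from a DTD
    or schema."""
    root = ET.fromstring(xml_text)
    ids, refs, errors = [], [], []
    for elem in root.iter():
        if 'id' in elem.attrib:
            ids.append(elem.attrib['id'])
        if 'idref' in elem.attrib:
            refs.append(elem.attrib['idref'])
    for value in set(ids):
        if ids.count(value) > 1:
            errors.append('duplicate ID: ' + value)
    for value in refs:
        if value not in ids:
            errors.append('dangling IDREF: ' + value)
    return errors

print(check_ids('<root><a id="x"/><b id="x"/><c idref="y"/></root>'))
# -> ['duplicate ID: x', 'dangling IDREF: y']
```

Twenty lines of code does instantly what the quiz wants me to do by squinting.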

Why not make me take a test on grinding coffee beans or loading ink into a ballpoint pen? ‘Cause, you know, these are things that are important for an office to function as well. No? The division of labor means that we pay others to do these tasks for us? Well, howdy-effing-do.

Now, there MAY be a valid argument that goes like this: memorizing all this garbage really plants an understanding of what XML is good at in my head. Sometimes sequencing elements is really important for a database.

Bollocks. These things are tested because they’re easy to test*. Period.

I’m considering dropping this class.

*and by that I mean easy for machines to grade. Well, I don’t want to learn how to be an effing machine. That’s what I buy machines for.

-=-

Update: I got quite a lot of pushback on the discussion forum for posting something similar to this. Hard to say whether it’s my aggressive and off-putting personality or whether my views actually have no merit.

I was (implicitly) called lazy and one guy said that “somebody has to build the validation tools”.

Well, I certainly am lazy, but mostly I’m just a douchebag. Anyway, here is my response:

Unpopular sentiment, it seems. Maybe it’s just my unpleasant tone. Let me try again.

I’m not sure I understand the first reply, but I like a lot of the second. Building an XML validating tool is a much more creative and effective way of learning what is and is not valid XML than the given assignment. I’d rather spend 5 hours doing that than 30 minutes rote-memorizing tag syntax.

Is it so wrong to expect more of a university course than this?

How about testing me on these questions:

  1. When should XML be used vs. some other standard? (I think this is what the first response is getting at.)
  2. What are the limit cases for XML use, and why might it break down?
  3. What are some examples of instances when XML was used and it failed, or succeeded?

I remain disappointed. Am I really so alone on this?

What Goes With Oatmeal? Today It’s Linear Algebra

Linear Algebra… doesn’t it sound so impressive?

When I was in my last year of high school, we had three options for math courses: calculus, statistics (called finite for some reason) and linear algebra. Honest to god, I skipped LA because it sounded so daunting (and wasn’t a strict prerequisite for any university programs I applied to).

So often intimidating jargon masks very simple procedures and concepts.

Well, I’m learning LA over breakfast today because matrix multiplication is the fastest way of comparing linear regression functions’ effectiveness (that’s what we’re hinting at, anyway). Matrix multiplication is actually so simple I’m not even going to bother with notes.
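For the record, the whole procedure fits in a few lines of Python. This is a naive sketch with invented toy data — the lecture’s actual point is that you’d call a tuned library instead of rolling your own:

```python
def matmul(A, B):
    """Row-by-column rule: C[i][j] is the dot product of
    row i of A with column j of B."""
    inner = len(B)
    assert all(len(row) == inner for row in A), 'inner dimensions must match'
    return [[sum(A[i][k] * B[k][j] for k in range(inner))
             for j in range(len(B[0]))]
            for i in range(len(A))]

# Toy design matrix: 3 examples, 2 features (first column is the
# intercept term). Data invented for illustration.
X = [[1.0, 2.0],
     [1.0, 3.0],
     [1.0, 4.0]]

# Two candidate parameter vectors stored as columns: ONE multiply
# produces the predictions of both hypotheses for every example.
Theta = [[1.0, 0.0],
         [2.0, 3.0]]

predictions = matmul(X, Theta)
# -> [[5.0, 6.0], [7.0, 9.0], [9.0, 12.0]]
```

That’s why it matters here: stacking hypotheses as columns turns “compare a pile of regression fits” into a single multiplication.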

What’s interesting to me is why it’s useful in this context. Quite simply, it’s useful because somebody (somebodies) spent a bunch of time building super-fast matrix multiplication functionality in every imaginable programming language.

Now, I don’t know why people have designed super-optimal implementations of matrix multiplication, but it’s a pretty awesome public good. Did they do this before machine learning made choosing among various linear regression hypotheses a problem to solve?

Realistically, it was probably a bunch of kids looking to do an awesome PhD dissertation: why not build a super-optimized matrix multiplication library?

Learning by solving problems. That’s what it’s all about. Hat Tip to Alan Kay.

A Teaching Moment

To my everlasting surprise, somebody made it far enough through some of my course notes to understand what on earth I was going on about.

I was forwarded a link to a real-life implementation of XML. Actual examples are always nice for thinking through the implications of the theory.

But be forewarned, ye hapless Web denizens, this is a discussion not fit for all. Formatting reports for transferring retirement-related employee data among federal agencies. Has quite the ring to it, non?

Here’s the question: why and how do people use these tools?

The purpose of all this nonsense is to get machine-readable data into the mothership system. Surely they’re choking on the FedEx bills and warehouses of paper files. It’s the friggen 21st century, after all.

XML does give you machine readable data. And it has this other benefit: it doesn’t really matter how you create it. Each government agency could format a report out of a sophisticated relational database or pay a legion of underemployed construction workers to handcode a text file. Either works as long as the format checks out.
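A toy illustration of that indifference, using Python’s built-in XML tools. The element names here are invented, not from the actual federal spec:

```python
import xml.etree.ElementTree as ET

# Route 1: a program (stand-in for the sophisticated relational
# database) builds the report. Element names invented for illustration.
report = ET.Element('report')
emp = ET.SubElement(report, 'employee')
ET.SubElement(emp, 'name').text = 'Jane Doe'
ET.SubElement(emp, 'service_years').text = '12'
generated = ET.tostring(report, encoding='unicode')

# Route 2: an underemployed construction worker hand-types the same
# thing into a text file.
handcoded = ('<report><employee><name>Jane Doe</name>'
             '<service_years>12</service_years></employee></report>')

# The receiving system can't tell the difference.
assert generated == handcoded
```

As long as the format checks out, the mothership neither knows nor cares which route produced the bytes.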

So XML just plugs into your existing system (even if it’s a system of handwritten forms and carbon copies). Database systems are not quite so forgiving. You need a “new system”, in the most horrible, time/cost draining meaning of the term.

In this case, I’d speculate that the xml format is considered an early first step. It’s hardly feasible to lay the redundant paper-form jockeys off any time soon. Unions will make sure of that. But having a continuous corporate structure holds you back, too.

In more lightly-regulated, process-heavy industries, most companies were either acquired or driven out of business before the haggard survivors finally completed their metamorphosis, which is actually never really complete. Google ‘COBOL programming language’ for a taste of the eternal duel against legacy software. And paper files?! Machines barely even read that crap. Try finding (with your computer!) any reliable data collected before 2000 (i.e. the dawn of machine history). Oh, you found some? Well, hide your grandkids, ’cause that shit was INPUTTED BY HAND!

Anyway, back to Uncle Sam’s pension files. The endgame is obvious: direct API links between the central system and every payroll/HR system in each office. This eliminates costs (jobs) and will improve accuracy. Good stuff.

Until then we’re still building XML files and presumably emailing them around. I can hardly be critical here as I’ve only just started to see the emergence of API links between insurers and reinsurers. No XML schemas, though, because they’re using a type-controlled relational database. Fancy way of saying they keep the data clean at the entry point: pretty hard to soil those databases. As it should be.

To my novice eye the system impresses. Flicking through the documentation suggests they might want to cool off on the initialisms and structured prose as it reads a bit like an engineering manual from the 60s. But engineers they probably are (and targeting an engineering audience to boot), so I’m probably being unfair.

Bless ’em.

Machine Learning: first impressions

Wow, this is pretty cool stuff. Some notes:

There are two kinds of machine learning: supervised and unsupervised.

Supervised learning is literally just running regressions on datasets you know something about. That’s it. They further break the regressions down into binary-variable regression (classification problems) and plain-old single/multi-variate regression. The point of regression, of course, is to estimate a smooth function that describes a lumpy dataset.

Unsupervised learning is where the sex is in this field. That’s Google News (clustering stories together that are ‘about’ the same topic) and various other kinds of data mining. The idea is that you get a dataset and ask the computer to find a pattern.
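The clustering idea can be sketched as a tiny one-dimensional k-means — one common clustering algorithm, my pick rather than anything the course has named yet:

```python
import random

def kmeans_1d(points, k=2, iterations=20, seed=0):
    """Tiny one-dimensional k-means: told nothing but 'find k groups',
    it lets the structure emerge from the distances alone."""
    random.seed(seed)
    centers = random.sample(points, k)
    for _ in range(iterations):
        # Assign each point to its nearest center...
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # ...then move each center to the mean of its cluster.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

print(kmeans_1d([1.0, 1.2, 0.8, 9.9, 10.1, 10.3]))
# finds centers near 1.0 and 10.1 with no labels supplied
```

Nobody tells it which numbers belong together; the two groups fall out of the data.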

There were quizzes in this, too, at which I did distinctly better than yesterday with the databases. I chalk that up to a better teacher setting questions up. As always, there are instances of false precision leading to a binary result (WRONG ANSWER, you idiot!), when in the real world I’d probably have gotten away with my approach.

For instance, there was a question that asked us to identify the type of learning problem: one was predicting whether email was spam/not spam (obviously a ‘classification’/binary problem) and the other was predicting how much of a warehouse of goods would be sold or not sold in three months. I said that was also a classification problem, but the instructor thought it would be a normal regression problem. Could probably go either way, but I lose.

Finally, I am going to need to learn another programming language, Octave, which is apparently an open-source version of MATLAB. (The days of building programming languages and selling them for money are long gone.) Great.

Next up is a linear regression tutorial then a linear algebra lesson. I never took linear algebra, so I am distinctly not looking forward to the amount of time I’ll probably need to spend on this.

But I press on nonetheless.

What (Meta Skills) Do Computers Teach Us?

Easy: programming. But what does that do for us?

I just read Alan Kay’s essay: The Computer Revolution Hasn’t Happened Yet. There’s also a video of the lecture on which this essay is based. Haven’t watched it yet.

Here’s the synopsis: when the boys at PARC in the 70s were inventing just about every major component of the personal computer we use today, they had gigantic aspirations. They had a printing press-style revolution in mind, and Kay is unimpressed with humanity’s progress using those breakthroughs. He figures we’re still only scratching the surface of its power. I agree.

Here’s how he rationalizes it:

One way to look at the real printing revolution in the 17th and 18th centuries is in the co-evolution in what was argued about and how the argumentation was done.

Increasingly, it was about how the real world was set up, both physically and psychologically, and the argumentation was done more and more by using and extending mathematics, and by trying to shape natural language into more logically connected and less story-like forms.

The point here is:

As McLuhan had pointed out in the 50s, when a new medium comes along it is first rejected on the grounds of “too strange and different”, but then is often gradually accepted if it can take on old familiar content. Years (even centuries) later, the big surprise comes if the medium’s hidden properties cause changes in the way people think and it is revealed as a wolf in sheep’s clothing.

So, the computer is going to literally change the way we think about and solve problems and this hasn’t really happened yet.

Big thought, that one. I like it a lot.

Kay would answer my questions at the beginning of this post as follows, perhaps: computers let us learn programming, which lets us simulate stuff, to play with ideas.

He spends quite some time on his work with children learning science by programming computers to test out ideas of their own. To learn the best way one can learn: by failing. Or let’s dust off an old metaphor: the printing press let us learn by watching, the computer allows us to learn by doing.

If this is right, it means that tomorrow’s people will simply have a better intuitive grasp of difficult concepts: they’ll be smarter. Is it crazy to say that a pedagogy with computer games as its centerpiece will revolutionize education and the world? Sure sounds a bit crazy.

Kay laments that our society sees a computer and thinks ‘super-TV’. Ouch, but he’s right. Remember the One Laptop Per Child program? Kay’s affiliated with it, unsurprisingly. When I heard that I had a flashback to some of the commentary: “What on earth will kids do with a cheap computer when they don’t have water? Watch YouTube?” Imagine Kay’s exasperated reply: “they’d learn, you fool!”

Because they aren’t super-televisions. Oh, no.

Computer literacy was once learning to type. Mechanical skills?! How laughably 19th century. More recently it meant learning how to open a document in Windows: pshah, that’s like teaching one book in an English class. Better to teach the kid to write!

You pick up those basic skills as you go along. The point is that we can’t rely on everyone teaching themselves. Computer literacy means literally learning how to read from and write to computers. It is learning programming.

And it’s the future.

Databases: Intro, Relational and XML (Hierarchical)

First day wasn’t so bad. I watched about 40 minutes of video at 1.2x or 1.5x (I’d probably have fallen asleep at 1x pace) and learned a bit.

There are four kinds of developers in the database programming world: builders, designers, programmers and administrators.

This course is not about (#1 above) building a database system, which would involve designing the database interaction with the physical system in C or Assembly or something crazy like that. Nor is it about (#4) maintaining a database in use, which involves optimizing resources, minimizing downtime and keeping things running.

What we’re learning about is choosing a database type, designing schemas, writing queries and incorporating the structure into a program. In other words, this is a database course for people who build database functionality into a program.

There are two choices for databases today: relational databases (in my case, SQLite or some other system that uses SQL as its query language) or hierarchical databases (XML, which means a flat text file formatted in a certain manner).

We were just getting into a description of what XML is when a quiz popped up in the lecture, which was neat.

The questions, however, were ridiculous, not least because I got them all wrong. Here are some things I learned from the quiz (honestly, none of this was discussed in the lecture):

  1. ALWAYS use relational databases when you can. In particular, when the data structure is fixed (a few ‘columns’ and lots of records), relational is the default. Why this is so isn’t discussed (grr), but I suspect that relational databases are just much faster.
  2. XML is useful when the data is ‘hierarchical’, which means that data can be easily described as subsets of other parts of the dataset. The example they used in the question was a family tree. For this question I chose ‘XML only’ when the right answer was ‘either XML or relational’. Again, relational is the default; only stray from it IF YOU HAVE A DAMN GOOD REASON TO.
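Here’s what the default case in point 1 looks like in practice, sketched with Python’s built-in sqlite3 module (table and rows invented for illustration):

```python
import sqlite3

# Fixed structure (a few columns, lots of records): the textbook
# relational case. Table and data invented for illustration.
con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, gpa REAL)')
con.executemany('INSERT INTO student VALUES (?, ?, ?)',
                [(1, 'Ada', 3.9), (2, 'Bob', 2.7), (3, 'Cleo', 3.4)])

# Declarative query: say WHAT you want; the engine decides HOW.
rows = con.execute(
    'SELECT name FROM student WHERE gpa > 3.0 ORDER BY name').fetchall()
# -> [('Ada',), ('Cleo',)]
```

The fixed schema is declared once up front, and every record after that is just another row — exactly the shape relational systems are built for.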

I also learned what ‘relational algebra’ is. Turns out it isn’t as scary as it sounds. Algebra, I’m reminded, simply means a vocabulary of symbols for concepts. Relational algebra, therefore, can be thought of as a series of symbols that represent actions a database program performs (sigma selects rows, pi projects columns, blah blah blah). I’m hoping it’s somewhat intuitive.
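To convince myself it really is intuitive, here are the two workhorse operators — selection (sigma) and projection (pi) — sketched in plain Python over a made-up relation:

```python
# A relation as a list of dicts; data invented for illustration.
student = [
    {'id': 1, 'name': 'Ada',  'gpa': 3.9},
    {'id': 2, 'name': 'Bob',  'gpa': 2.7},
    {'id': 3, 'name': 'Cleo', 'gpa': 3.4},
]

def select(relation, predicate):
    """Sigma: keep the rows satisfying a condition."""
    return [row for row in relation if predicate(row)]

def project(relation, attributes):
    """Pi: keep only the named columns."""
    return [{a: row[a] for a in attributes} for row in relation]

# pi_name(sigma_{gpa > 3.0}(student))
result = project(select(student, lambda r: r['gpa'] > 3.0), ['name'])
# -> [{'name': 'Ada'}, {'name': 'Cleo'}]
```

Each Greek letter is just a function from relations to relations; composing them is the whole game.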

I’m happy I got this out of the way early because seeing the words RELATIONAL ALGEBRA staring out at me from the syllabus was giving me the heebie-jeebies. The last thing I need to do is brush up on lots of complicated math.

And it takes FOREVER to relearn math.