Mark Zuckerberg’s Livejournal Entry While Making Facemash
8:13pm. <!-- Jessica Alona is a bitch. I need to think of something to take my mind off her. --> I need to think of something to occupy my mind. Easy enough, now I just need an idea…
9:48pm. I’m a little intoxicated, not gonna lie. So what if it’s not even 10pm and it’s a Tuesday night? What? The Kirkland facebook is open on my computer desktop and some of these people have pretty horrendous facebook pics. I almost want to put some of these faces next to pictures of farm animals and have people vote on which is more attractive. It’s not such a great idea and probably not even funny, but Billy comes up with the idea of comparing two people from the facebook, and only sometimes putting a farm animal in there. Good call Mr. Olson! I think he’s onto something.
11:09pm. Yea, it’s on. I’m not exactly sure how the farm animals are going to fit into this whole thing (you can’t really ever be sure with farm animals…), but I like the idea of comparing two people together. It gives the whole thing a very Turing feel, since people’s ratings of the pictures will be more implicit than, say, choosing a number to represent each person’s hotness like they do on hotornot.com. The other thing we’re going to need is a lot of pictures. Unfortunately, Harvard doesn’t keep a public centralized facebook so I’m going to have to get all the images from the individual houses that people are in. And that means no freshman pictures…drats.
12:58am. Let the hacking begin. First on the list is Kirkland. They keep everything open and allow indexes in their Apache configuration, so a little wget magic is all that’s necessary to download the entire Kirkland facebook. Child’s play.
1:03am. Next on the list is Eliot. They’re also open, but with no indexes in Apache. I can run an empty search and it returns all of the images in the database in a single page. Then I can save the page and Mozilla will save all the images for me. Excellent. Moving right along…
1:06am. Lowell has some security. They require a username/password combo to access the facebook. I’m going to go ahead and say that they don’t have access to the main fas user database, so they have no way of knowing what people’s passwords are, and the house isn’t exactly going to ask students for their fas passwords, so it’s got to be something else. Maybe there’s a single username/password combo that all of Lowell knows. That seems a little hard to manage since it would be impossible for the webmaster to tell Lowell residents how to figure out the username and password without giving them away completely. And you do want people to know what kind of authentication is necessary, so it’s probably not that either. So what does each student have that can be used for authentication that the house webmaster has access to? Student ids anyone? Suspicions affirmed – time to get myself a matching name and student id combo for Lowell and I’m in. But there are more problems. The pictures are separated into a bunch of different pages, and I’m way too lazy to go through all of them and save each one. Writing a perl script to take care of that seems like the right answer. Indeed.
1:31am. Adams has no security, but limits the number of results to 20 a page. All I need to do is break out the same script I just used on Lowell and we’re set.
1:42am. Quincy has no online facebook. What a sham. Nothing I can do about that.
1:43am. Dunster is intense. Not only is there no public directory, but there’s no directory at all. You have to do searches, and if your search returns more than 20 matches, nothing gets returned. And once you do get results, they don’t link directly to the images; they link to a php that redirects or something. Weird. This may be difficult. I’ll come back later.
1:52am. Leverett is a little better. They still make you search, but you can do an empty search and get links to pages with every student’s picture. It’s slightly obnoxious that they only let you view one picture at a time, and there’s no way I’m going to go to 500 pages to download pics one at a time, so it’s definitely necessary to break out emacs and modify that perl script. This time it’s going to look at the directory and figure out what pages it needs to go to by finding links with regexes. Then it’ll just go to all of the pages it found links to and jack the images from them. It’s taking a few tries to compile the script…another Beck’s is in order.
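The link-finding step described above can be sketched roughly as follows. This is a minimal illustration in Python, not the perl script itself, and it runs on inline HTML strings rather than a live site. The regex patterns, the function names find_page_links and find_image_urls, and the example.edu URLs are all assumptions made up for the sketch.

```python
import re
from urllib.parse import urljoin

# Hypothetical patterns: they only handle quoted href/src attributes,
# which is enough for a sketch but not for arbitrary HTML.
LINK_RE = re.compile(r'<a\s[^>]*href=["\']([^"\']+)["\']', re.IGNORECASE)
IMG_RE = re.compile(r'<img\s[^>]*src=["\']([^"\']+)["\']', re.IGNORECASE)


def find_page_links(directory_html, base_url):
    """Return absolute URLs for every link in the directory page.

    Order is preserved and duplicates are dropped.
    """
    urls = []
    for href in LINK_RE.findall(directory_html):
        url = urljoin(base_url, href)
        if url not in urls:
            urls.append(url)
    return urls


def find_image_urls(page_html, page_url):
    """Return absolute URLs for every image referenced on one page."""
    return [urljoin(page_url, src) for src in IMG_RE.findall(page_html)]


if __name__ == "__main__":
    # Made-up sample markup standing in for a directory page and a student page.
    directory = '<a href="student.php?id=1">A</a> <a href="student.php?id=2">B</a>'
    base = "http://house.example.edu/facebook/"
    for page_url in find_page_links(directory, base):
        print(page_url)
    print(find_image_urls('<img src="pics/1.jpg">', base + "student.php?id=1"))
```

Regexes are a blunt tool for HTML; they work here only because the sample markup is regular, and a proper HTML parser would be the safer choice for anything messier.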
2:08am. Mather is basically the same as Leverett, except they break their directory down into classes. There aren’t any freshmen in their facebook…how weak. So I