Welcome to the Allmydata blog!

Hey out there and welcome to the Allmydata blog. If you don’t know what we do, we are basically an online storage and backup company that allows for secure, reliable, and cheap file protection.

We’re planning to use this forum to share what we’re working on, problems we’ve run into (so that others can avoid them), and other ideas related to the market we’re trying to serve.

Please let us know what you think with comments on our product, market, or anything else that you think is relevant.

Talk soon,

Peter

September 21st, 2006

About

Allmydata is dedicated to providing secure, online storage and backup for your important files. We started because we have all been victims of hard drive crashes, theft, file corruption, or just plain forgetfulness. As more and more important information moves online, secure and reliable storage becomes a necessity and we hope that our solution helps.

September 21st, 2006

Beta 1.7 in the works

I just wanted to give a quick update on our Beta 1.7 progress. We released Beta 1.6 on 01 August 2006 and since then have been working on improving memory and CPU usage on the client, implementing incremental Outlook backup, and maturing our API for use by external applications (web client, portal views, etc.). In addition, you no longer have to share space in order to buy space on our storage grid.

We’re looking forward to the 1.7 release as we push towards taking the product out of beta testing. Let us know if there are any other features you’d like!

Last, if you ever want some kick-ass pizza, try Goat Hill Pizza at 18th and Connecticut in San Francisco, the fuel of Potrero Hill software developers.

Talk soon,

Peter

5 comments September 21st, 2006

Installing and configuring blogging software

The blogging software we are using is WordPress, an open source platform associated with a for-profit hosting company. The software is super easy to install: it is just a group of PHP files pointed at an existing MySQL database. Note that you do have to install MySQL (apt-get install mysql-server on Debian) or configure WordPress to point to an existing server. Then just create a database called wordpress and update the user and password in the wp-config.php file.

We also moved our installation fairly easily from our initial staging machine to another by dumping the database (mysqldump wordpress > wordpress_dump.sql) and then loading the dump into the database on the new machine.

Last, it was also easy to change themes. There is a big library of them on the web, and it’s fun to play with different ones. We chose one and then modified the colors to match our current color scheme. In fact, it is so easy to update, create, and track content that we are thinking seriously about using WordPress as the content management system for our entire site and then making modifications for our special services (login, billing, etc.).

September 22nd, 2006

Measuring health on a P2P mesh

One of the most interesting things about a peer-to-peer network is trying to figure out how healthy it is. Specifically in our case, we have to guarantee file recoverability to within some reasonable limit that is equal to or better than comparable services. This includes standard systems operations monitoring (cpu, network, db, applications), but also specific end-to-end and recoverability tests on the mesh itself.

In our mesh, one of the main areas of measurement is file health and repair. To determine file health, any peer on the network can (and does) ask about the health of specific files, basically asking: if I wanted to recover this file right now, could I? In addition to that basic binary measurement, we keep track of how spread out the file is on the mesh and how resilient it is to mesh damage. Say we started out with a given file X.txt that was encrypted to a data blob X and then encoded into 100 pieces to be spread among the peers. In order to fully recover the file, we would only need to recover any 25 pieces from the peer mesh. To determine the health of X, we just ask the mesh how many pieces are available, and if that number sinks to a particular level (in this case, say only 50 pieces are available), we invoke a file repairer to bring the number of pieces on the network back up to the desired amount (in this case around 100).

We then just keep track of the health of files over time in a database and then any peer can later choose to repair any damaged files before they become unrecoverable. In one of my next posts I’ll talk about what we do to repair files.
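The check described above can be sketched in a few lines of Python. This is only an illustration using the numbers from the example (100 pieces, any 25 needed, repair at 50); the names EncodingParams and check_health are made up for this post and are not our actual client code, and treating exactly 50 available pieces as the repair trigger is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncodingParams:
    total_pieces: int = 100     # pieces created when the file is encoded
    needed_pieces: int = 25     # any 25 pieces are enough to recover the file
    repair_threshold: int = 50  # assumed: repair once availability drops to this level


def check_health(available_pieces: int, params: EncodingParams = EncodingParams()) -> dict:
    """Summarize one file's health from the piece count the mesh reports."""
    recoverable = available_pieces >= params.needed_pieces
    # A repairer needs to rebuild the file first, so repair only makes
    # sense while the file is still recoverable.
    needs_repair = recoverable and available_pieces <= params.repair_threshold
    return {
        "recoverable": recoverable,
        # How many more pieces could disappear before the file is lost.
        "spare_pieces": max(available_pieces - params.needed_pieces, 0),
        "needs_repair": needs_repair,
        # Pieces the repairer would regenerate to get back to the target.
        "pieces_to_restore": params.total_pieces - available_pieces if needs_repair else 0,
    }
```

With these numbers, a file with 50 pieces left is still recoverable with 25 pieces to spare, but it would be flagged for repair and 50 pieces would be regenerated.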

Talk soon,

Peter

1 comment September 26th, 2006

Data security and integrity in a public P2P mesh

Any time somebody holds your assets for you, questions around both security and recoverability come up. How do you know that your asset will be safe? How do you know that you can retrieve your asset at a convenient or necessary time? If you lose your key(s), are there other ways of obtaining access to your asset? How much does it cost to store, monitor, and recover your asset?

These questions (and probably many others) are very relevant with Allmydata as we are providing the mechanism to store very valuable assets - your digital data including photos, documents, videos, and more. So how does Allmydata address the above questions and provide a valuable secure data storage service without compromising the integrity of the data and the privacy of the users?

In our next few posts, we’ll examine the different facets of the secure storage solution, including topics such as:

  • Encryption - Where does it occur and why?
  • Resilient storage - How safe is the data and how resilient is it to hardware failures?
  • Privacy - Who has access to your data?
  • Recovery - How does one recover data? How quickly can it happen?

Talk soon,

Peter

November 29th, 2006

Mobile mesh!

Just a quick note from the train. I’m writing this entry and posting it via my Verizon EVDO card, and I am also running a node on our storage mesh and watching the performance and behavior of the product on a link with variable bandwidth and intermittent drops in service as we pass through tunnels. So far no major problems have occurred, but it has prompted me to take a few notes on how to handle network transitions more gracefully.

Back to work now,

Peter

October 5th, 2006

Benefits of open source

I recently had dinner with a bunch of folks to talk about different open source business models and how they could add value for the business and end users. My initial impression is that there are four major potential benefits from an open source model.

  • transparency of roadmap can lead to trust
  • ease of installation and customization can lead to widespread adoption and barriers to competitive entry
  • public vetting of ideas can point out bugs, uncover security holes, and offer new ideas
  • recruiting high quality software developers

If any of the above points is of value to the business, then some type of open source model might make sense to pursue. However, open source is not some magic incantation that coerces hordes of happy developers to write, fix, and maintain your code. In fact, most of the businesses that I’ve spoken with in this area (like JBOSS/Redhat, Alfresco, etc.) seem to be run very much like any other software shop and have the same issues to deal with.

The one area that open source seems to help out a lot in is during the initial stages of a company when you don’t have the resources to build a big sales team but need a customer base to test and improve your product. By making it easy for any potential customer to download, examine, and modify your project you can achieve quick iterations of the product without large investments in sales, support, and testing. Couple this with a nice transparent roadmap of where you are planning to take the product and you have a situation in which the end user has had all barriers to using the product removed and can make a very clean decision on whether or not the product adds value in his environment.

So what is the difference between an open source company and a proprietary company that provides a really nice roadmap, takes suggestions from its user base, and has a great API to extend the product? Does open source enforce good integration interfaces and coding standards? Does it engender so much goodwill that open source companies can charge a premium for their services built around the project?

Last, what type of licensing model should be used? There are lots of different models out there, ranging from very open yet obliging (like GPL) to hybrid (free for education, paid for business) and various mixes. I’m planning on doing a bit more research into this and maybe writing a post or finding a link to an overview of this area.

Talk soon,

Peter

1 comment October 10th, 2006

Web clients, native clients, mobile clients …

One fun problem to wrestle with as a startup when you have limited resources is how to deliver a useful interface to your service or product. If possible, you can determine the problem you are trying to solve, the target user personas, and then choose the interface(s) that will suit them best. At Allmydata, we did the following breakdown:

Problem

For individuals and small businesses today, data backup solutions are too cumbersome and/or expensive for them to use, thus they end up losing valuable information because they do not back it up.

Target Users

  • Alex: A freelance, professional journalist, travels, needs to back up contracts, articles, photos, knows his way around a laptop but just wants an easy, reliable way to back up his documents in case of a hard drive crash or stolen laptop.
  • Jennifer: Alex’s wife, professional mom, runs the local parent-teacher association (PTA) and Multiple Sclerosis chapter, keeps household finances, doctor’s reports, kids’ pictures and videos online and doesn’t want to lose them, usually uses her desktop at home, will share an account with Alex.
  • Seth: Just graduated from university, has a big music and photo collection plus all of his papers and projects from school, doesn’t have a lot of disposable income but has extra disk space and his home computer is online nearly all of the time.
  • Nadia: A technical guru at a media content distributor, needs to have access to a large pool of cheap storage for all of the videos, pictures, and other content they need to store and distribute online, needs a simple programming interface to authenticate, upload, download, and examine files.

Given the above information, we’ve done some market research to obtain the size and quantity breakdown of the data Alex, Jennifer, and Seth feel is important enough to back up. We can also get a feel for what type of computer they are running (Alex most likely has an IBM or Dell laptop running Windows XP, Jennifer has a Dell or generic PC at home, and Seth has a generic PC). In addition, we’ve gathered information about where they will want to access their information. For example, Alex travels a lot and would really like to have a web interface from which he can access his files in case he left his computer somewhere or just doesn’t feel like booting it up. Seth is thinking about getting a new Mac before his student discount runs out and needs to know if he can run the product on more than one platform (an interface to his roommate’s Linux box would also be cool but not necessary).

Interfaces

From the above descriptions, we’ve decided to concentrate on providing a native Windows interface and a web client. The native Windows interface allows all three of our target users to interact with the product as if it were just another drive on their computer. They can then drag and drop files, cut and paste, or use the normal Windows metaphors for moving files to and from the disk. The web client will allow us to serve both users who are not running Windows and users who are trying to access their files from another location.

On the partner side, Nadia needs a nice programming interface so she can reliably store, examine, and retrieve large amounts of data on behalf of her user base. For this we have provided an XMLRPC interface that can be enabled for partners and used to either access a storage grid that we manage or one under their management.
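To give a feel for what a partner like Nadia would write, here is a small self-contained sketch using Python's standard XML-RPC modules. It is not our actual API: the ToyStorageService class and its upload, examine, and download methods are hypothetical names for illustration, the "grid" is just an in-memory dictionary served on the local machine, and authentication is left out to keep it short.

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer


class ToyStorageService:
    """Hypothetical in-memory stand-in for a partner storage interface."""

    def __init__(self):
        self._files = {}

    def upload(self, name, data):
        # Binary payloads arrive wrapped in xmlrpc.client.Binary.
        self._files[name] = data.data
        return len(data.data)

    def examine(self, name):
        return {"name": name, "size": len(self._files[name])}

    def download(self, name):
        return xmlrpc.client.Binary(self._files[name])


def demo():
    # Port 0 asks the operating system for any free local port.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_instance(ToyStorageService())
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        host, port = server.server_address
        grid = xmlrpc.client.ServerProxy(f"http://{host}:{port}")
        size = grid.upload("clip.mov", xmlrpc.client.Binary(b"video bytes"))
        info = grid.examine("clip.mov")
        data = grid.download("clip.mov").data
        return size, info, data
    finally:
        server.shutdown()
        server.server_close()
```

The appeal of XML-RPC for partners is visible in demo(): once the proxy object exists, storing, examining, and retrieving a file are ordinary method calls, whichever grid is on the other end.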

Conclusion

In short, with limited resources it seems best to identify a few key user personas that you want to target, determine and design the types of interfaces they need, and then make sure not to try to include everything in your first UI. It’s much easier to add functionality later than to try to explain to a user the myriad buttons, checkboxes, and statistics of your multi-functional cross-platform UI when all they want to do is back up their files.

Talk soon,

Peter

October 25th, 2006

Multiple platform support via a web client

With the increase in functionality of web applications, it is becoming more compelling to create only a web interface for certain types of products. With our product, most of the work happens in the background: uploading and downloading files, looking for files that have changed, and scheduling backups. All of that work is done in a cross-platform language (Python) that is easily deployed on multiple operating systems. The front-end is really only about configuring a few parameters (user/pass, what to back up, buy more space, etc.) and managing your virtual drive, which can easily be handled by the new batches of widgets available to web designers. For more involved UI interaction, a native GUI still has a lot to offer, but even this is being eroded by new desktop/web offerings from Adobe and Business Objects (CXNOW).
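As an illustration of how little platform-specific code the background work needs, here is a rough sketch of "looking for files that have changed" using only the Python standard library. The approach (comparing file size and modification time between two scans) and the names snapshot, changed_files, and demo are assumptions made for this example, not a description of how our client actually detects changes.

```python
import tempfile
from pathlib import Path


def snapshot(root):
    """Map each file under root to a (size, modification time) signature."""
    signatures = {}
    for path in Path(root).rglob("*"):
        if path.is_file():
            stat = path.stat()
            signatures[str(path.relative_to(root))] = (stat.st_size, stat.st_mtime_ns)
    return signatures


def changed_files(previous, current):
    """Return files that are new or whose signature differs from the last scan."""
    return sorted(name for name, sig in current.items() if previous.get(name) != sig)


def demo():
    # Build a throwaway folder, scan it, change it, and scan again.
    with tempfile.TemporaryDirectory() as root:
        Path(root, "article.txt").write_text("draft")
        Path(root, "photo.jpg").write_text("pixels")
        before = snapshot(root)
        Path(root, "article.txt").write_text("draft with edits")
        Path(root, "contract.doc").write_text("signed")
        after = snapshot(root)
        return changed_files(before, after)
```

The same code runs unchanged on Windows, Mac, and Linux, which is the point: only the thin configuration front-end needs per-platform thought.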

Have fun,

Peter

November 6th, 2006
