-
asked by ffmaer
I think the Django / Rails decision is really a matter of personal preference. You can hire great programmers for both, lots of great sites have been built with both, there are tons of libraries and tools for both, etc.
I haven't used Drupal so I can't really comment on it. My impression is that it's much better suited for portal type applications where you can leverage the large number of existing plugins, but again, I shouldn't really comment as I'm speaking in ignorance.
We've ended up using less and less of Django over time after replacing chunks with our own custom bits. We still heavily use the templating system though.
-
asked by ivankirigin
Yep, you can get recommendations for Facebook or Twitter users. Twitter user data is public since it's based on the public Twitter follow graph. Predictions for Facebook users require the Facebook user to authenticate with us via OAuth in order to authorize your access to their Hunch predictions.
-
asked by ivankirigin
We think of everything as a "user", an "item", or a "preference" between a user and an item. So in this case you might represent Dropbox usage by saying that every type of file is an item and there is a preference of "liking" or "4 stars" between the user and the file type. That way you can model that some people "like" PowerPoint files, other users like AutoCAD files, and others like tar.gz files.
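A minimal sketch of that user / item / preference idea, applied to the file-type example. The class name, field names, ID strings, and the numeric rating scale are all assumptions made for illustration; none of them come from Hunch's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preference:
    """One edge between a user and an item (all names here are hypothetical)."""
    user_id: str
    item_id: str
    rating: float  # assumed scale: 4.0 stands in for "4 stars" / "liking"


def items_liked_by(preferences, user_id, min_rating=4.0):
    """Return the sorted item IDs this user rated at or above min_rating."""
    return sorted(
        {p.item_id for p in preferences
         if p.user_id == user_id and p.rating >= min_rating}
    )


# Each file type is treated as an item, as described above.
preferences = [
    Preference("alice", "filetype:pptx", 4.0),
    Preference("bob", "filetype:dwg", 5.0),
    Preference("carol", "filetype:tar.gz", 4.0),
    Preference("alice", "filetype:dwg", 1.0),
]

liked = items_liked_by(preferences, "alice")
```

Here alice ends up "liking" only the PowerPoint file type, because her low rating for the AutoCAD file type falls below the assumed threshold.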
-
We currently use 50. It's kind of an arbitrary number. More factors do increase our predictive ability, but they also make the system slower. Too many factors and you risk over-fitting to your training data. 50 is a nice trade off between accuracy and speed for now.
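To make the accuracy / speed trade-off concrete, here is a rough sketch that assumes "factors" means latent factors in a matrix-factorization-style model, where each user and each item gets a vector of k numbers and a prediction is their dot product. That model choice is an assumption for illustration; the answer above does not say which model is used.

```python
import random


def make_factors(n_rows, k, rng):
    """Create n_rows small random vectors of length k (one per user or item)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(k)] for _ in range(n_rows)]


def predict(user_vec, item_vec):
    """Predicted preference as a dot product; costs O(k) per prediction."""
    return sum(u * v for u, v in zip(user_vec, item_vec))


def parameter_count(n_users, n_items, k):
    """Number of learned values; grows linearly with the factor count k."""
    return (n_users + n_items) * k


rng = random.Random(0)
k = 50  # the factor count mentioned above
user_factors = make_factors(3, k, rng)  # hypothetical tiny population
item_factors = make_factors(4, k, rng)
score = predict(user_factors[0], item_factors[1])
```

Under this assumption, doubling k doubles both the work per prediction and the number of parameters that have to be fit from the same training data, which is where the over-fitting risk comes from.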
-
We don't, though not for any real reason other than lack of time to try it out. We use MySQL for most of the site and a bunch of custom stuff for batch processing the taste graph edge and node lists.
-
asked by pims
Yeah, for internal graphs. For user-facing charts we used to use ChartDirector from http://www.advsofteng.com/, though we don't have any user-facing graphs anymore.
-
There's not really any one thing I recommend. Fundamentally, you have to understand how EVERYTHING on your site works (hardware, dbs, web servers etc) in order to understand how bottlenecks form.
Second, I recommend reading websites' tech blogs to see how they really work, and reading resources like http://www.mysqlperformanceblog.com/ and http://developer.yahoo.com/performance/
-
Django 1.2.3, though we really only use the template rendering from Django.
-
asked by mikerowan
Right now it's just returning the most recent X 'preferences' (ratings), so it could cover any amount of time depending on how active your friends are. You can also restrict it to only returning activity on a specific item if you want more depth on one thing.
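A small sketch of that behavior: take the most recent X ratings from a set of friends, optionally narrowed to one item. The `Activity` record, the `recent_activity` function, and its parameters are hypothetical names for illustration, not the real API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    """One rating event (hypothetical shape, not the real response format)."""
    user_id: str
    item_id: str
    rating: float
    timestamp: int  # assumed: larger means more recent


def recent_activity(activities, friend_ids, limit, item_id=None):
    """Return up to `limit` of the friends' ratings, newest first.

    If item_id is given, only activity on that item is considered.
    """
    friends = set(friend_ids)
    matching = [
        a for a in activities
        if a.user_id in friends and (item_id is None or a.item_id == item_id)
    ]
    matching.sort(key=lambda a: a.timestamp, reverse=True)
    return matching[:limit]


activities = [
    Activity("bob", "item:inception", 5.0, 100),
    Activity("carol", "item:dropbox", 4.0, 300),
    Activity("dave", "item:inception", 2.0, 200),
    Activity("bob", "item:dropbox", 3.0, 400),
]
friends = ["bob", "carol"]

latest = recent_activity(activities, friends, limit=2)
on_item = recent_activity(activities, friends, limit=10, item_id="item:inception")
```

Because the cut-off is a count rather than a time window, the two results in `latest` could span minutes or months depending on how often the friends rate things.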
-
We all had some amount of academic background, but there's a big gap between what makes a good ML paper and what makes a good recommendation product :)
-
We have a pretty standard relational model for most of our data.
The major non-relational data for us is the taste graph, which, as its name implies, is a graph structure.
-
It really depends on the type of requests and the read/write break down of them. Requests that write data are more expensive, generally, than requests that just read data.
Writing data frequently involves synchronizing and persisting data in a db, which is usually the least scalable part of your app. Reading data can be parallelized by reading from caches or db replicas, so reads are much cheaper for us.
-
asked by joshdance
We all code on Macs and/or Linux machines we ssh into. I can't imagine doing dev on Windows. It's great being able to run pretty much the same software stack on my laptop as well as on a Linux box.
-
We don't use R (or Matlab or Octave). We mostly just use Python as the scripting glue to call things like NumPy, CVXOPT, or BLAS.
-
asked by corydarby
We use svn, which isn't really a statement about our philosophy on revision control. We like git as well, but it's just not what we ended up picking.
We try to avoid branching if at all possible. We encourage everyone to commit often and to develop large projects as a series of small self-contained changes that can quickly go out to the production site. We typically update the site a few times a day.
-
asked by corydarby
I'm not sure. We're open to something like Cassandra, but so far MySQL has been good enough.
-
asked by corydarby
Nope. I've heard that its join performance is pretty poor. I think it's designed to handle workloads that focus more on simple queries, inserts and updates.
-
We try to break things down into small projects. A typical project might last from part of a day to at most a week. Generally you can't tell if you're behind on an N day schedule until N/2 days have elapsed, so we like short project durations.
We like to hire programmers who can also serve as mini-project managers for what they're working on. So we rely on their good sense over building incredibly detailed specifications of what to do.
We also like to create projects that require little coordination with anyone else. Like any good parallel algorithm, coordination and communication costs ultimately limit how closely N people come to doing N times more work than a single person.
Finally, we always ask people to come up with their own schedules. No one believes in or will commit to a schedule that is forced on them. We don't worry about people taking too long. If anything, most good developers are prone to overestimating how much they can get done, not underestimating.
-



