Databases

Ask Slashdot: Which NoSQL Database For New Project? 272

Posted by Soulskill
from the mo-sql-mo-problems dept.
DorianGre writes: "I'm working on a new independent project. It involves iPhones and Android phones talking to PHP (Symfony) or Ruby/Rails. Each incoming call will be a data element POST, and I would like to simply write that into the database for later use. I'll need to be able to pull by date or by a number of key fields, as well as do trend reporting over time on the totals of a few fields. I would like to start with a NoSQL solution for scaling, and ideally it would be dead simple if possible. I've been looking at MongoDB, Couchbase, Cassandra/Hadoop and others. What do you recommend? What problems have you run into with the ones you've tried?"
This discussion has been archived. No new comments can be posted.

  • by tubs (143128) on Wednesday April 09, 2014 @05:17AM (#46702801)

    Do you need a database to do what you're trying to do? Why not just write the information to a text file (CSV or tab-separated?), and use other programs to query the data?

  • by Anonymous Coward on Wednesday April 09, 2014 @05:23AM (#46702825)

    You might want to consider a SQL database.

  • by Anonymous Coward on Wednesday April 09, 2014 @05:27AM (#46702831)

    Definitely use a CSV or tab-separated file. A NoSQL database is wayyyy overkill. Even a SQL database is overkill for what you're trying to do.

  • NoSQL? (Score:5, Insightful)

    by aaaaaaargh! (1150173) on Wednesday April 09, 2014 @05:29AM (#46702839)

    I would like to start with a NoSQL solution for scaling

    And there it is, the proverbial premature optimization ...

  • light (Score:4, Insightful)

    by invictusvoyd (3546069) on Wednesday April 09, 2014 @05:33AM (#46702853)
    SQLite is a relational database management system contained in a C programming library. In contrast to other database management systems, SQLite is not a separate process that is accessed from the client application, but an integral part of it.
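
To make the "integral part of the application" point concrete, here is a minimal sketch using Python's standard `sqlite3` module. The table and column names (`readings`, `posted_at`, `device`, `value`) are assumptions made up for illustration, not anything from the original question.

```python
import sqlite3

# ":memory:" keeps the whole database inside this process; a file path
# (for example a hypothetical "events.db") would persist it to disk instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (posted_at TEXT, device TEXT, value REAL)")

# Each incoming POST becomes one row; values are bound as parameters.
conn.execute(
    "INSERT INTO readings VALUES (?, ?, ?)",
    ("2014-04-09T05:33:00", "iphone", 1.5),
)
conn.commit()

rows = conn.execute("SELECT device, value FROM readings").fetchall()
print(rows)
```

No server is started and no network connection is opened; the library reads and writes the database directly from the calling process.
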
  • by tonywestonuk (261622) on Wednesday April 09, 2014 @05:34AM (#46702855)

    "I'll need to be able to pull by date or by a number of key fields"

    So, in other words, you have already decided on key fields. If you use a database, it has things called indexes, which can search billions of rows for a key field in a fraction of a second.
    If you don't use something with indexes then you can't do this.

    Where has this idea that databases can't scale come from? - The world runs on databases, for heaven's sake. Do you think when you take money out of an ATM, it's going to MongoDB? - And yet there are millions of ATMs and you can take money out of your VISA account in almost all of them anywhere in the world. That is called scale.
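
As a rough illustration of the index point, the sketch below builds a small SQLite table, indexes a key field together with the date, and asks SQLite how it plans to run a lookup. The schema (`events`, `device_id`, `posted_at`, `amount`) and the index name are hypothetical, and 10,000 rows is nowhere near billions; the sketch only shows that the lookup goes through the index rather than scanning the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, device_id TEXT, "
    "posted_at TEXT, amount REAL)"
)
# Index the key field together with the date so both filters can use it.
conn.execute("CREATE INDEX idx_events_device_date ON events (device_id, posted_at)")

conn.executemany(
    "INSERT INTO events (device_id, posted_at, amount) VALUES (?, ?, ?)",
    [(f"dev{i % 100}", f"2014-04-{(i % 28) + 1:02d}", float(i)) for i in range(10_000)],
)

query = (
    "SELECT amount FROM events "
    "WHERE device_id = ? AND posted_at BETWEEN ? AND ?"
)
params = ("dev7", "2014-04-01", "2014-04-10")

# EXPLAIN QUERY PLAN reports whether SQLite searches an index or scans the table.
plan = conn.execute("EXPLAIN QUERY PLAN " + query, params).fetchall()
plan_text = " ".join(row[-1] for row in plan)
print(plan_text)

matches = conn.execute(query, params).fetchall()
print(len(matches))
```

The same idea carries over to MySQL or PostgreSQL, although the syntax for inspecting the query plan differs between them.
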

  • by cyber-vandal (148830) on Wednesday April 09, 2014 @05:35AM (#46702867) Homepage

    Where has this idea that Databases can't scale come from?

    Salesmen

  • Re:Use PostgreSQL (Score:5, Insightful)

    by Lennie (16154) on Wednesday April 09, 2014 @05:37AM (#46702873) Homepage

    Yes, that is what I wanted to point out too.

    Also, PostgreSQL 9.4 has jsonb, which in certain tests less than a year ago was faster than MongoDB.

  • MariaDB (Score:2, Insightful)

    by Anonymous Coward on Wednesday April 09, 2014 @05:39AM (#46702881)

    I would consider using the latest release of MariaDB.

    You can use it as a standard MySQL server, but they also have Cassandra NoSQL as an engine for it now (since the release of 10)... So you would be easily able to play with things on different database types and see what suits your situation better.

  • by mwvdlee (775178) on Wednesday April 09, 2014 @05:49AM (#46702927) Homepage

    Basically the question is: what's the expected volume of records, and fields per record?

    A solution for 100 records a week with 4 fields each would be different from 1000 records per second with 30 fields each.
    1000 records/sec with 4 fields would be yet another solution.
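
To put rough numbers on those three scenarios, here is a back-of-envelope sketch. It assumes the stated rates are sustained for a full year, which is an assumption added here, not something the comment claims.

```python
SECONDS_PER_DAY = 24 * 60 * 60
WEEKS_PER_YEAR = 52
DAYS_PER_YEAR = 365


def yearly_volume(records_per_year, fields_per_record):
    """Return (records per year, field values per year)."""
    return records_per_year, records_per_year * fields_per_record


# 100 records a week, 4 fields each.
small = yearly_volume(100 * WEEKS_PER_YEAR, 4)
# 1000 records a second, 30 fields each.
wide = yearly_volume(1000 * SECONDS_PER_DAY * DAYS_PER_YEAR, 30)
# 1000 records a second, 4 fields each.
narrow = yearly_volume(1000 * SECONDS_PER_DAY * DAYS_PER_YEAR, 4)

for name, (records, values) in [("small", small), ("wide", wide), ("narrow", narrow)]:
    print(f"{name}: {records:,} records/year, {values:,} field values/year")
```

The first case is a few thousand rows a year; the other two are tens of billions, which is why the answer depends so heavily on the expected volume.
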

  • Just Use SQL (Score:5, Insightful)

    by Anonymous Coward on Wednesday April 09, 2014 @05:52AM (#46702939)

    I just felt I had to comment on this. So many developers start with the phrase "I need NoSQL so I can scale" and almost all of them are wrong. The chances are your project will never ever ever scale to the kind of size where the NoSQL design decision will win. It's far more likely that the NoSQL design choice will cause far more problems (performance etc.) than the theoretical scaling issues.

    Take, for example, two systems I've been involved with for managing WiFi access to large-scale networks (100,000+ concurrent users, 1000s of APs): one uses MongoDB, the other is based on PostgreSQL. The MongoDB-based solution has very real performance problems: its reporting takes a very long time to run and uses very large amounts of system RAM (24G in some cases), that performance is only degrading as the system grows, and there are many other performance issues. These issues are not just Mongo issues; NoSQL is simply not well suited to the task. The system has been rewritten using an SQL backend and now works much better and, importantly, scales better. Growth in the system is no longer degrading performance, and the points where we need hardware upgrades or extra servers are now much more predictable, so we can predict cost-base growth in relation to user growth.

    NoSQL does not guarantee scaling; in many cases it scales worse than an SQL-based solution. Work out what your scaling problems will be for your proposed application, when they will become a problem, and whether you will ever reach that scale. Being on a bandwagon can be fun, but you would be in a better place if you really thought through any potential scaling issues. NoSQL might be the right choice, but in many places I've seen it in use it was the wrong choice, chosen based on one developer's faith that NoSQL scales better rather than on thinking through the scaling issues.

  • by FyRE666 (263011) on Wednesday April 09, 2014 @06:07AM (#46702979) Homepage

    Please don't do this (use a flat file) to store data for a web app that's likely to be accessed by more than one device at a time. Unless you implement your own file locking mechanism, you'll eventually end up with corrupt entries. Even if you do implement your own locking scheme, it's probably not going to be as efficient as using a DB. It's a 5 minute job to set up a new MySQL DB and associated query to push data in, then you can filter and report on it much more easily. It's something DBs are very good at!

    Unless you have a specific need to scale horizontally, it's generally better to stick with a SQL DB for web apps. I've used MySQL, PostgreSQL and Oracle for this. MySQL is by far the easiest to work with, hence its popularity. I don't actually know of any advantage to using PostgreSQL; it doesn't perform any better, and is (or at least used to be) much less user friendly.
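
The "push data in, then filter and report" workflow could look roughly like the sketch below. It uses SQLite through Python's standard library instead of MySQL so that it runs without a server; the table `posts` and its columns are assumed names, and the SQL itself is plain enough to carry over to MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (posted_on TEXT, region TEXT, units INTEGER, revenue REAL)"
)
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?, ?)",
    [
        ("2014-04-07", "north", 3, 30.0),
        ("2014-04-07", "south", 1, 10.0),
        ("2014-04-08", "north", 2, 20.0),
        ("2014-04-09", "north", 5, 50.0),
    ],
)

# Filter by a key field, then total two fields per day for a trend report.
trend = conn.execute(
    "SELECT posted_on, SUM(units), SUM(revenue) FROM posts "
    "WHERE region = ? GROUP BY posted_on ORDER BY posted_on",
    ("north",),
).fetchall()

for day, units, revenue in trend:
    print(day, units, revenue)
```

Doing the same over a flat file would mean writing the filtering, grouping and locking by hand.
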

  • by Richard_at_work (517087) <richardprice@@@gmail...com> on Wednesday April 09, 2014 @06:43AM (#46703099)

    I think many people get stuck in thinking "one single database, that's it, my initial decision condemns me forever", when in fact there's no shame in having many databases.

    Stick the raw data into one database, choose the database that suits that.

    Transform the data from the raw database into something you can use day to day, that's well structured etc., choose the database for that.

    Transform the data from the day-to-day schemas into something that's more suitable for archiving and long-term reporting, again choose the database for that.

    You don't have to have one single database type, every particular one has its strengths, so use them!
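
A minimal sketch of the first two stages, raw capture followed by a transform into a structured form, might look like this. For brevity both stages are tables in one in-memory SQLite database rather than separate database products, and the names `raw_posts` and `readings` and the JSON payload shape are assumptions for illustration.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Stage 1: store each incoming payload untouched.
conn.execute("CREATE TABLE raw_posts (received_at TEXT, body TEXT)")
# Stage 2: a structured table for day-to-day queries.
conn.execute("CREATE TABLE readings (received_at TEXT, device TEXT, value REAL)")

conn.executemany(
    "INSERT INTO raw_posts VALUES (?, ?)",
    [
        ("2014-04-09T06:43:00", json.dumps({"device": "android", "value": 2.5})),
        ("2014-04-09T06:44:00", json.dumps({"device": "iphone", "value": 4.0})),
    ],
)

# Transform step: parse each raw body and copy the fields that matter.
raw_rows = conn.execute("SELECT received_at, body FROM raw_posts").fetchall()
for received_at, body in raw_rows:
    payload = json.loads(body)
    conn.execute(
        "INSERT INTO readings VALUES (?, ?, ?)",
        (received_at, payload["device"], payload["value"]),
    )

structured = conn.execute(
    "SELECT device, value FROM readings ORDER BY received_at"
).fetchall()
print(structured)
```

Because the raw payloads are kept as received, the structured stage can be rebuilt later if the schema, or the database behind it, changes.
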

  • Big mistake (Score:5, Insightful)

    by msobkow (48369) on Wednesday April 09, 2014 @06:58AM (#46703139) Homepage Journal

    Telecommunications data is eminently suitable to schema table storage in any relational database, which with a little work, will let you index by the keys you intend to query by.

    NoSQL solutions are better for unstructured data that doesn't come in predictable formats or value sets.

    You need to take a step back and look at the problem before you decide on a solution. Don't be one of those idiots who tries to use a hammer to drive a screw.

  • by OzPeter (195038) on Wednesday April 09, 2014 @07:00AM (#46703147)

    Based on your information no one can give you solid advice.

    IMHO the question is deliberately designed to be vague. iPhones and Android devices, PHP and Ruby On Rails .. that is such a shotgun blast of specifications that are totally unrelated to the DB use on the back end that the entire question smells of click bait to me.

  • by janoc (699997) on Wednesday April 09, 2014 @07:15AM (#46703191)

    Databases don't scale for people who don't understand SQL, don't understand data normalization, indexing and want to use them as flat files. Unfortunately, a way too common anti-pattern :(

    The second group are too-cool-to-learn kids using the latest development tool fad on the market to build yet another Facebook/Twitter/Instagram/whatever clone ...

  • Re:NoSQL? (Score:5, Insightful)

    by Sarten-X (1102295) on Wednesday April 09, 2014 @08:00AM (#46703397) Homepage

    As an expert (relative to most of Slashdot) in NoSQL databases, with a significant amount of experience in Hadoop and HBase systems, I agree wholeheartedly.

    NoSQL solutions can be ridiculously fast and scale beautifully over billions of rows. Under a billion rows, though, and they're just different from normal databases in various arguably-broken ways. By the time you need a NoSQL database, you'll be successful enough to have a well-organized team to manage the transition to a different backend. For a new project, use a RDBMS, and enjoy the ample documentation and resources available.

  • by khchung (462899) on Wednesday April 09, 2014 @08:07AM (#46703435) Journal

    Based on your information no one can give you solid advice.

    IMHO the question is deliberately designed to be vague. iPhones and Android devices, PHP and Ruby On Rails .. that is such a shotgun blast of specifications that are totally unrelated to the DB use on the back end that the entire question smells of click bait to me.

    Either that, or the OP simply has no idea how databases work at all.

    If OP had any idea how a database (any database, not just relational) works, he would be talking about data and transaction volumes, access patterns, transactional requirements, data integrity constraints, retention and housekeeping requirements, etc.

    Instead, as you said, he talked about device platforms, communication protocols, language and runtime environment, which are all irrelevant to choosing a database. (OK, the last may be a bit relevant depending on which database is used.)

  • Re:NoSQL? (Score:2, Insightful)

    by Anonymous Coward on Wednesday April 09, 2014 @08:29AM (#46703545)

    As an expert (relative to most of Slashdot) in NoSQL databases, with a significant amount of experience in Hadoop and HBase systems, I agree wholeheartedly.

    NoSQL solutions can be ridiculously fast and scale beautifully over billions of rows. Under a billion rows, though, and they're just different from normal databases in various arguably-broken ways. By the time you need a NoSQL database, you'll be successful enough to have a well-organized team to manage the transition to a different backend. For a new project, use a RDBMS, and enjoy the ample documentation and resources available.

    Agreed. I used a NoSQL database on a project I'm working on at the moment, and stick by that decision even though I don't even have millions of rows, but my situation is somewhat different from the OP's: my data model is very difficult to map to SQL. I have hundreds of different entity types, each of which has different field storage requirements, and I need to be able to associate entities of different types according to a variety of rules, meaning that some entity types may have hundreds of different types of entity associated with them. SQL quite simply sucks for this kind of data, but thankfully applications where you end up with this kind of data are few and far between. OP's data sounds like an ideal candidate for storage in a relational database; he has one basic entity type, no need to make any kind of connection between entities, and apparently no complicating factors at all.

  • Re:Short Intro (Score:3, Insightful)

    by Brian Nelson (3610471) on Wednesday April 09, 2014 @08:58AM (#46703703)
    And any text file can be transactional if you write your code right. We can keep going down this road about how you don't /need/ X technology, but nobody wins. It's really OK to see the good in different technologies.
  • by boristdog (133725) on Wednesday April 09, 2014 @09:19AM (#46703839)

    As someone who is currently trying to convert a 20-year-old, multi-million-entry flat-file DB into a real DB for a major corporation without bringing the corporation to its knees, I heartily concur with NOT using flat files if there is ANY chance of this growing beyond a few hundred entries.

    By now hundreds of applications are using the old flat-file DB. I have so much re-coding to do that I will probably retire before it is all complete.

  • by squiggleslash (241428) * on Wednesday April 09, 2014 @09:21AM (#46703861) Homepage Journal
    Then perhaps he should use a real database, rather than embrace a fad started by people who don't like databases?
  • by luis_a_espinal (1810296) on Wednesday April 09, 2014 @09:35AM (#46703991) Homepage

    I would like to start with a NoSQL solution for scaling,

    This is a solution looking for a problem. Or more precisely, you are looking for an excuse to use a piece of technology or paradigm. Don't get me wrong, your systems requirements might indeed be best served using a NoSQL solution, but what exactly has your analysis shown regarding this?

    Scaling is not just a technical feature (NoSQL, SQL, Jedi mind-meld tricks). Scaling is a function of your architecture. You can NoSQL the shit out of your solution, but if your software and system architecture is not scalable, then having NoSQL will mean chicken poop as solutions go.

    and ideally it would be dead simple if possible.

    If you want simple, put a simple RDBMS schema (a properly normalized one) in place, and have your code use a simple, technology-agnostic persistence layer that maps your domain-level artifacts to database artifacts. If you ever have to replace the back-end, then you can do so with minimal changes to the API that domain-level artifacts use to persist themselves with the persistence layer.

    Design your domain solution around domain-specific artifacts. Persistence technology is typically a low-level design/implementation detail, an important one obviously (and a critical one for some classes of systems).

    But for what you are describing, the choice shouldn't even be coming into the picture without first having an architectural notion of your solution.
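
One way to read "technology-agnostic persistence layer" is sketched below: the domain code talks to a small interface, and the storage behind it can be swapped. Every name here (`Reading`, `ReadingStore`, `SqliteReadingStore`, `InMemoryReadingStore`, `record_and_fetch`) is hypothetical, and the in-memory store merely stands in for whatever other back-end might replace SQLite.

```python
import sqlite3
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Reading:
    device: str
    posted_on: str
    value: float


class ReadingStore(Protocol):
    """What the domain code depends on; no database details leak through."""

    def save(self, reading: Reading) -> None: ...

    def by_date(self, posted_on: str) -> list: ...


class SqliteReadingStore:
    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS readings "
            "(device TEXT, posted_on TEXT, value REAL)"
        )

    def save(self, reading: Reading) -> None:
        self._conn.execute(
            "INSERT INTO readings VALUES (?, ?, ?)",
            (reading.device, reading.posted_on, reading.value),
        )

    def by_date(self, posted_on: str) -> list:
        rows = self._conn.execute(
            "SELECT device, posted_on, value FROM readings WHERE posted_on = ?",
            (posted_on,),
        )
        return [Reading(*row) for row in rows]


class InMemoryReadingStore:
    """Stand-in for some other back-end; same interface, different storage."""

    def __init__(self) -> None:
        self._items = []

    def save(self, reading: Reading) -> None:
        self._items.append(reading)

    def by_date(self, posted_on: str) -> list:
        return [r for r in self._items if r.posted_on == posted_on]


def record_and_fetch(store: ReadingStore) -> list:
    # Domain code: identical regardless of which store is passed in.
    store.save(Reading("iphone", "2014-04-09", 1.0))
    store.save(Reading("android", "2014-04-10", 2.0))
    return store.by_date("2014-04-09")


sqlite_result = record_and_fetch(SqliteReadingStore(sqlite3.connect(":memory:")))
memory_result = record_and_fetch(InMemoryReadingStore())
print(sqlite_result == memory_result)
```

Replacing the back-end then means writing one new class with the same two methods, while `record_and_fetch` and the rest of the domain code stay unchanged.
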

  • by Bacon Bits (926911) on Wednesday April 09, 2014 @11:33AM (#46705089)

    God forbid someone make them think about their data structures and how the end user might need to query them with their own reports.
