Databases Books Businesses Education

Good Database Design Books? 291

Posted by timothy
from the this-slot-that-tab dept.
OneC0de writes "I am the Director of IT for a small/medium sized marketing company, where I personally write the code that runs our applications. We use a variety of technology at our office, the majority of which rely on MS-SQL and MySQL databases. I am familiar with tables, SQL queries, and have a general understanding of how the SQL databases work. What I'm looking for is a good book, particularly a newer book, to explain general database design techniques, and maybe explain some relational tables. We have some tables that have millions of rows, and I'd like to know the best method of designing these tables."
This discussion has been archived. No new comments can be posted.

  • A Few Suggestions (Score:5, Interesting)

    by eldavojohn (898314) * <eldavojohn@nOspAM.gmail.com> on Thursday July 08, 2010 @06:40PM (#32845484) Journal

    We have some tables that have millions of rows, and I'd like to know the best method of designing these tables.

    I'm a developer, not a database expert. But it seems that every now and then I have to get my hands dirty with data modeling. "The best method" is probably a really vague concept. If you have serious hardware constraints, then the best method changes from an easily maintainable system to something more complex. There's give and take in database design, and I guess a million rows is really something that a traditional relational database should be able to handle. So I'd suggest any book that teaches data modeling will suit you here. I happened to learn on Data Modeling Essentials [amazon.com], which was decent but not great. I have heard good things about Len Silverston's growing series that concentrates more on patterns. But really what you're going to want is a book on data modeling or analysis that teaches you the orders of normal form, when to use cross reference tables, etc., so you can get a better idea of good modeling standards. At a million rows, you might not find the need to refactor if you read about the new best practices, but perhaps you could make a business case to eventually migrate.
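
    To make the "cross reference table" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MS-SQL or MySQL. The customer/campaign schema and every table and column name are hypothetical, invented only for illustration. It shows a many-to-many relationship resolved through a third table whose primary key is the pair of foreign keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE campaign (
    campaign_id INTEGER PRIMARY KEY,
    title       TEXT NOT NULL
);
-- Cross-reference table: one row per (customer, campaign) pairing.
CREATE TABLE customer_campaign (
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    campaign_id INTEGER NOT NULL REFERENCES campaign(campaign_id),
    PRIMARY KEY (customer_id, campaign_id)
);
INSERT INTO customer VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO campaign VALUES (10, 'Spring Mailer'), (20, 'Fall Promo');
INSERT INTO customer_campaign VALUES (1, 10), (1, 20), (2, 20);
""")


def campaigns_for(customer_name):
    """Return the campaign titles linked to one customer, sorted by title."""
    rows = conn.execute(
        """SELECT c.title
             FROM campaign c
             JOIN customer_campaign cc ON cc.campaign_id = c.campaign_id
             JOIN customer cu ON cu.customer_id = cc.customer_id
            WHERE cu.name = ?
            ORDER BY c.title""",
        (customer_name,),
    ).fetchall()
    return [r[0] for r in rows]
```

    The composite primary key keeps the same pairing from being stored twice, which is the kind of rule a data modeling book would have you state explicitly.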

    Now there are other topics that require entirely separate books because they are such a diverging path from relational databases. It's not common but your database can be based on something other than an object or table [wikipedia.org]. If you consider the internals of Google, perhaps BigTable is the most prolific database implementation out there and while interesting [google.com], it is sort of a very specific proprietary database implementation. You could take this approach to tailor your company's database to be precisely what you need but this would clearly be overkill in your case. You don't talk about any bottlenecks or impending loads that need to be carefully considered so instead of treading down this path, I suggest you first take a course on MySQL or get the de facto book on whatever database you use [amazon.com] and play around with fine tuning on a test system. A lot of DBs out there allow you to tune them through a configuration file so that your particular needs are met more closely. If you're looking for this sort of continuing education just out of curiosity, pick up a book on database design and start to tinker. But it requires a lot of knowledge and effort to start a database technology from scratch and compete with vanilla out of the box technologies like MySQL and PostgreSQL.

    From what information you provide in your question, I'd suggest this book to help you understand database designs more via industry proven patterns [amazon.com]. That assumes you have all the basic database design practices covered.

    • by OneC0de (1851710)
      Thank you for the great and informative reply!
    • Re:A Few Suggestions (Score:5, Informative)

      by RobertM1968 (951074) on Thursday July 08, 2010 @06:54PM (#32845624) Homepage Journal

      We have some tables that have millions of rows, and I'd like to know the best method of designing these tables.

      I'm a developer, not a database expert. But it seems that every now and then I have to get my hands dirty with data modeling. "The best method" is probably a really vague concept. If you have serious hardware constraints, then the best method changes from an easily maintainable system to something more complex. There's give and take in database design, and I guess a million rows is really something that a traditional relational database should be able to handle. So I'd suggest any book that teaches data modeling will suit you here.

      eldavojohn makes some excellent points and gives some great suggestions. Keep in mind, like elda suggests, nothing is cut and dry. Configuration, resources, numbers of connections for specific data, etc; all will have an impact (or should) on what you should do and how you should design.

    • Re:A Few Suggestions (Score:5, Informative)

      by hguorbray (967940) on Thursday July 08, 2010 @07:02PM (#32845708)
      http://www.amazon.com/Case-Method-Entity-Relationship-Modelling/dp/0201416964

      I used this book at Foothill College in an intro to data management class, and it taught me more than any of the dozen Oracle classes I took, once I got past the terminology of tuples, etc.

      this one is also well-recommended:
      http://www.amazon.com/Database-Systems-Design-Implementation-Management/dp/0760049041

      and this one is good for people without dba or architect background:
      http://www.amazon.com/Database-Design-Mere-Mortals-Hands/dp/0201752840/ref=sr_1_1?ie=UTF8&s=books&qid=1278629171&sr=1-1

      I would stay away from the vendor-specific books, as good database design should be DBMS-agnostic.

      -I'm just sayin'
      • Re: (Score:3, Informative)

        I agree with the CASE Method ER book; Barker is the king of data modeling. In the book he walks through some real-world scenarios (airline ticketing, manufacturing bill-of-materials) that are fundamental to relational databases.

        You may find some implementation differences with SQLServer, like not using cursors (a common pl/sql construct in Oracle) and some limitation to using "join on" SQL syntax, but I used the book when I went from writing single user applications to enterprise apps.

        Whatever book you get, e

    • Information Modeling and Relational Databases [amazon.ca]

      It's not just another database modelling book. It discusses in depth the object-role modelling (ORM) [wikipedia.org] method of taking human ways of talking about data and turning it into a really well-modelled database.

      I stumbled across this book when looking for new data modelling books, and I was deeply impressed with the method.

    • Re: (Score:3, Insightful)

      by t33jster (1239616)

      "The best method" is probably a really vague concept.

      I disagree completely. The "best" method varies widely, because it is specific to the RDBMS you're using.

      OP says he understands how databases work, but it seems to be limited to how to put data in & get data out. A database is (or can be) more than a bit bucket. If it's taking too long to fetch records from a 10 million record table, there are some serious performance issues here. It could be any combination of bad data modeling, improper indexing, underpowered hardware, poorly configured concur

  • Database in Depth (Score:5, Informative)

    by Anonymous Coward on Thursday July 08, 2010 @06:47PM (#32845550)

    Database in Depth: Relational Theory for Practitioners
    Publisher: O'Reilly Media; 1 edition (May 1, 2005)
    Language: English
    ISBN-10: 0596100124
    ISBN-13: 978-0596100124

    Best DB book I have ever owned/read/seen!

    • Re: (Score:3, Informative)

      by Anonymous Coward

      I agree wholeheartedly that Database In Depth is one of the best DB books in print - but would recommend for this reader instead Date's slightly later book SQL and Relational Theory, which replaces the Tutorial D examples with SQL and goes more in depth into how to use SQL relationally.

    • Re: (Score:3, Informative)

      That's been updated and replaced by SQL and Relational Theory: How to Write Accurate SQL Code [oreilly.com] .
  • O'Reilly (Score:4, Informative)

    by JWSmythe (446288) <jwsmytheNO@SPAMjwsmythe.com> on Thursday July 08, 2010 @06:49PM (#32845566) Homepage Journal

    O'Reilly books are your friend. The "... in a Nutshell" books are a good place to start, and then proceed into the more advanced books. They have 25 titles related to MySQL [oreilly.com] and 53 titles related to Microsoft SQL [oreilly.com]. There are usually a few to browse through at the large chain book stores.

    • by i.r.id10t (595143)

      And, your local library may have a Safari subscription available for you, so you can get them *all* online (mine does, aclib will in theory cover anyone in Florida...)

  • A director that still codes? What a novel concept. Good for you.

    • is it ? reallllllly ?

      • by kramulous (977841)

        My assistant director and director barely know how to turn on a machine.

        Not that they really need to know; their job is more about people management: making sure that departments talk to each other (and nicely) and that friendship bonds are formed and not destroyed. Conflict resolution, etc.

        That is why I was surprised.

  • by obarthelemy (160321) on Thursday July 08, 2010 @06:54PM (#32845640)

    I'm a bit unclear about what you want to achieve:
    - easier end-user interface
    - more reliability (backups, journalling, redundancy...)
    - more speed
    - more security
    - more complicated data massaging (multi tables, statistics...)
    - better visualization (reports, graphs...)

    I'm not sure a single book can cover all that.

    • Re: (Score:3, Funny)

      by itwerx (165526)

      Here, let's just give him some answers:
      - normalize everything for consistency
      - denormalize everything for performance
      - index only key fields for performance
      - index everything for performance
      - date index everything for logging purposes
      - don't date index anything for performance reasons
      - sanitize your inputs at the db level instead of the client for security and performance
      - sanitize your inputs at the client level instead of the db for security and performance
      - use Postgresql because MySQL sucks
      - use MySQL b

  • by cduffy (652) <charles+slashdot@dyfis.net> on Thursday July 08, 2010 @06:59PM (#32845678)

    I'm not sure I'd trust a book to teach this subject as comprehensively as a good university course on the subject. Frequently, you can sit a class quite inexpensively if you're not going for credit.

    For that matter, isn't MIT or someone allowing free not-for-credit access to their eLearning materials?

    • I couldn't agree more. Database design isn't something you should learn from a book. You CAN of course, and it might work out, but you'd be much smarter to just take a class.

      Let me put it this way. Programming is really mostly about data structures. Database structure tends to live for DECADES. Screw up the initial design, and you'll be hurting the business for decades to come. This isn't something to be taken lightly. A good DB design can pay off huge returns in the future when you have to add featu

  • by 8282now (583198) on Thursday July 08, 2010 @07:03PM (#32845710) Journal

    IMHO: Joe Celko's SQL for Smarties (http://www.amazon.com/Joe-Celkos-SQL-Smarties-Programming/dp/0123693799/ref=sr_1_2?ie=UTF8&s=books) has shown itself to be a very nice book when you need to go beyond the basics to a deeper understanding of SQL.

    There are many other books on the subject, all the way to source material from Date and Codd, but Celko seems to be well informed and writes fairly well, I think.

    • by Jason Earl (1894)

      I was curious to see how far down I would have to read before Celko was mentioned. That's the book I would recommend.

  • Basic relational database design is about logic and structure. When compared with other areas of computing, I would argue that the original materials worked out by Codd and Date have not changed nearly as dramatically. There are certainly exceptional sub-areas where there have been major changes (e.g. the introduction of the object model and development of XML and RDF, to name but two prominent examples), but if I were you, I would suggest doing two things:

    1. Do some research into existing relational
  • by luis_a_espinal (1810296) on Thursday July 08, 2010 @07:09PM (#32845760) Homepage

    "I am the Director of IT for a small/medium sized marketing company, where I personally write the code that runs our applications. We use a variety of technology at our office, the majority of which rely on MS-SQL and MySQL databases. I am familiar with tables, SQL queries, and have a general understanding of how the SQL databases work. What I'm looking for is a good book, particularly a newer book, to explain general database design techniques, and maybe explain some relational tables. We have some tables that have millions of rows, and I'd like to know the best method of designing these tables."

    There is more to RDBMS than tables and SQL. Your developers should understand data normalization first and foremost, at least 1NF, 2NF and 3NF.

    http://en.wikipedia.org/wiki/Database_normalization [wikipedia.org]

    http://en.wikipedia.org/wiki/First_normal_form [wikipedia.org]

    http://en.wikipedia.org/wiki/Second_normal_form [wikipedia.org]

    http://en.wikipedia.org/wiki/Third_normal_form [wikipedia.org]

    The examples in the URLs above should suffice for getting a general understanding on how to start with a relational model. As for books, I'd suggest these:

    http://www.amazon.com/Relational-Database-Design-Implementation-Third/dp/0123747309/ref=sr_1_4?ie=UTF8&s=books&qid=1278630155&sr=8-4 [amazon.com]

    http://www.amazon.com/Information-Modeling-Relational-Databases-Management/dp/0123735688/ref=sr_1_3?ie=UTF8&s=books&qid=1278630306&sr=1-3 [amazon.com]

    I would also suggest C.J. Date's "Database in Depth: Relational Theory for Practitioners", but I can imagine the local penny arcade l33t-hax0r-wannabe crowd going batshit crazy about studying relational algebra and relational database theory in depth. To each his own. Most problems that arise in poorly designed relational database models arise from not understanding data normalization
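
    As a small illustration of why normalization matters, here is a sketch using Python's built-in sqlite3 module. The orders/customer schema is hypothetical and not taken from any of the linked pages. The flat table repeats the customer's city on every order, so the city depends on a non-key column (the customer name), which is the kind of dependency 3NF removes. The second schema stores the city once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Unnormalized: the customer's city is repeated on every order row.
CREATE TABLE orders_flat (
    order_id      INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL,
    customer_city TEXT NOT NULL,
    amount        REAL NOT NULL
);
INSERT INTO orders_flat VALUES
    (1, 'Acme', 'Boston', 100.0),
    (2, 'Acme', 'Boston', 250.0);

-- Normalized: the city depends only on the customer, so it lives in one row.
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE,
    city        TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    amount      REAL NOT NULL
);
INSERT INTO customer VALUES (1, 'Acme', 'Boston');
INSERT INTO orders VALUES (1, 1, 100.0), (2, 1, 250.0);
""")

# Update anomaly: changing only one flat row leaves Acme with two cities.
conn.execute("UPDATE orders_flat SET customer_city = 'Chicago' WHERE order_id = 1")
flat_cities = {
    r[0]
    for r in conn.execute(
        "SELECT customer_city FROM orders_flat WHERE customer_name = 'Acme'"
    )
}

# In the normalized schema, a single-row update is seen by every order.
conn.execute("UPDATE customer SET city = 'Chicago' WHERE customer_id = 1")
normalized_cities = {
    r[0]
    for r in conn.execute(
        "SELECT c.city FROM orders o JOIN customer c USING (customer_id)"
    )
}
```

    After this runs, flat_cities holds both cities while normalized_cities holds only the new one.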


  • OMG (Score:3, Informative)

    by noz (253073) on Thursday July 08, 2010 @07:12PM (#32845786)

    If you are designing anything bigger than a couple of gigabytes, you are in for some fun (or your users are). ;-)

    To be a good designer, there is no substitute for a thorough understanding of the subject matter. And you are a self-confessed n00b. Get an expert. Or study. Hard.

    Database in Depth: Relational Theory for Practitioners [amazon.com].

  • by MattBD (1157291) on Thursday July 08, 2010 @07:24PM (#32845874) Homepage
    I did an exam on SQL and database design recently and used The Manga Guide to Databases as part of my studies. If you don't want something too rigorous it's very good indeed - I found it a lot better at making stuff sink in than a dry, stuffy book. It gives a reasonably good idea of things like the first, second and third normal forms. Don't be put off by the fact that it looks a bit childish - the storytelling idea really works well. It probably won't work for everyone, but it did work well for me (I passed the exam with flying colours).
  • Text Book (Score:3, Informative)

    by pgn674 (995941) on Thursday July 08, 2010 @07:35PM (#32845946) Homepage
    My university course on databases used the text book A First Course in Database Systems by Jeff Ullman and Jennifer Widom. I rather enjoyed the book, and plan to have it above my desk in case any sort of database design or maintenance project comes up for me. The book's page is here [stanford.edu]; links to purchase are at the bottom.
  • by meburke (736645) on Thursday July 08, 2010 @07:40PM (#32845984)

    ...and improve your quality and maintainability?

    Back in the 70's and early 80's we learned a methodology called, "Data Structured Systems Design" and the fundamental presupposition was that everything could be expressed logically and accurately by describing it as relationships in set theory. I have not seen anything since that surpasses the quality and maintainability of database applications and systems.

    Someone already mentioned Joe Celko's book "SQL for Smarties" and I would recommend you first read his, "Thinking in Sets" before any of his other books.

    I would also suggest some earlier books by Ken Orr and Jean Dominique Warnier. If you learn the Warnier-Orr approach to DESIGNING the system before doing any coding, you will reduce the time necessary for maintaining the system. I have seen hundreds of small IT shops like yours, and much of the time Systems Analysis and Design is neglected and performed "off-the-cuff" by programmers who can't wait to get to the coding. I didn't originally believe Ken Orr's assertion that spending twice as much time designing the system would result in a sharp time reduction for overall project completion, but through experience and observation I became a believer.

    • What books do you have in mind? I did a search for Ken Orr and Jean Dominique Warnier, and neither of them seemed to have a book that was 'best.'
  • ...and other like-minded groups. You're going to learn more from interactions with other DBA's than from any book. I'm a dev at a place that can measure db growth in TB/week, and have learned a tremendous amount just from working with DBA's in our organization.

  • I found "Database Design for Mere Mortals" (ISBN 0-201-69471-9 [wikipedia.org]) to be a good, easy entry point for good database design methodology.
    • by KFW (3689) *

      I second that. Not extremely technical, but a good first read about relational databases, normalizing, etc. /K

  • I work with databases a lot, and while I've read books on design, I don't think it's complex with a relational database since most of the design has already been done for you. The guiding principles I follow are simple:

    1) Don't over-complicate your database. Only store data that you plan on using or you need to store, stay away from adding unnecessary tables or data. Don't try to build a fancy user interface with lots of code unless you expect the database to be used by a layperson, and only build suc

    • by Tablizer (95088)

      Don't abbreviate your field names. A modern relational database can handle spaces and long names. But a modern user still can not guess what your abbreviation meant. Yes, it makes the code a little longer, but you're better off in the end.

      I have to disagree. Long column names result in the column headings in typical table browsers being truncated unless you make them too wide to be useful. And when you work with SQL directly, long names can make the code bloated and hard to read. Many RDBMS come with a desc

  • by Invisible Now (525401) on Thursday July 08, 2010 @08:23PM (#32846376)

    These three lessons may not all be in any one book, but they can help in the real world:

    1) Learn what SQL Injection is and how to defend against it. It will ruin your day and could severely damage your current employment situation.

    2) Abstract your schema from your front-end applications. Stored procedures are easy to write, can provide security, and, if well written, stop injection attacks. They will let you change your database design without breaking your deployed apps. Just update the internal code in the procedure. Middleware and objects can do this, too.

    3) Bergstrom's law of sailing says: "You can get away with anything in less than 5 knots of wind." Similarly, any little box or blade with 2 to 4 GB of RAM can easily handle 5 to 10 million row tables. Dedicate the server to MySQL or MS SQL so they can cache and buffer efficiently and they will outperform much bigger boxes trying to run too many schemas and DBs concurrently. Learn to index. Don't be too puritanical about normalization. Returning a customer address shouldn't require 6 joins. And remember that moving large recordsets across the LAN/WAN may take much more time than the server query.

    You probably already know all this... but maybe someone else reading this doesn't.
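
    For anyone who hasn't seen point 1 in action, here is a minimal sketch using Python's built-in sqlite3 module. SQLite has no stored procedures, so this swaps in parameterized queries, which are the same defense at the application layer; a stored procedure only helps if it also avoids building SQL from strings. The users table and both function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
INSERT INTO users (name) VALUES ('alice'), ('bob');
""")


def find_unsafe(name):
    # Vulnerable: the input is spliced into the SQL text itself.
    sql = "SELECT name FROM users WHERE name = '" + name + "' ORDER BY name"
    return [r[0] for r in conn.execute(sql)]


def find_safe(name):
    # Parameterized: the value is bound separately and never parsed as SQL.
    rows = conn.execute(
        "SELECT name FROM users WHERE name = ? ORDER BY name", (name,)
    )
    return [r[0] for r in rows]


# Classic payload: closes the string literal and adds an always-true clause.
payload = "x' OR '1'='1"
```

    With the payload above, find_unsafe returns every user in the table, while find_safe returns nothing because no user is literally named that.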

  • An Introduction to Database Systems, C.J. Date.

    He's like the grandfather of relational database systems. Quel truly is the language of the Gods.

  • The Folks from ErWin (Score:3, Informative)

    by CharlieG (34950) on Thursday July 08, 2010 @09:27PM (#32846802) Homepage

    Back in the day when they were their own company they used to recommend

    Designing Quality Databases with IDEF1X Information Models

    I found the book VERY informative

  • Check out this book (Score:2, Interesting)

    by BigThor00 (1691638)
    Check out Database Design for Mere Mortals... It's a pretty good book for beginning database design.
  • by ffoiii (226358)
    The Data Modeling Handbook : http://www.amazon.com/Data-Modeling-Handbook-Best-Practice-Approach/dp/0471052906/ref=sr_1_58?s=books&ie=UTF8&qid=1278645029&sr=1-58 [amazon.com] It is not new, but relational theory hasn't changed much in the last 20 years either. I have been designing, developing, implementing and fixing relational databases and data warehouses for the last 15 years. The book above was one of the most useful things I read early on in my career. In my opinion, data integrity is one of the mo
  • by bigtrike (904535) on Thursday July 08, 2010 @11:32PM (#32847372)

    Database Modeling and Design: Logical Design, 4th Edition. Its ISBN is 0126853525. It taught me a lot about how databases work "under the hood". If you want to know the performance implications of a b+ tree index vs. a b-tree, this book will help.
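
    This does not show the b+ tree vs. b-tree difference itself, but as a rough sketch of why index choice matters at all, the snippet below asks SQLite (via Python's built-in sqlite3 module) how it plans the same query before and after an index exists. The orders table and the index name idx_orders_customer are hypothetical, and the exact plan wording varies between SQLite versions; MS-SQL and MySQL have their own plan tools.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL,"
    " amount REAL NOT NULL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    ((i % 1000, float(i)) for i in range(10_000)),
)

QUERY = "SELECT order_id, amount FROM orders WHERE customer_id = ?"


def plan(sql):
    """Return SQLite's query plan as one string (detail is the last column)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, (42,)).fetchall()
    return " | ".join(r[-1] for r in rows)


plan_before = plan(QUERY)  # no usable index yet: a full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(QUERY)   # the planner can now search the index instead
```

    Printing plan_before and plan_after side by side is a quick way to check whether a query on a multi-million-row table is actually using the index you think it is.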

  • Open a database book's index. Look up "natural key". If the author presents natural keys as a viable alternative to surrogate primary keys, then the author doesn't know what he or she's talking about and you should move on to the next book. Especially if they pull the whole "meaning" schtick.

    On the other hand, if the author gives you great real-world examples as to why you should use surrogate primary keys over natural keys in every possible situation, then you've got a winner.
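
    One way to picture the argument, as a sketch rather than a verdict: keep a meaningless surrogate key as the primary key, and still enforce the natural key with a UNIQUE constraint. The example uses Python's built-in sqlite3 module, and the customer/invoice schema is hypothetical. When the natural value (here an email address) changes, no child rows need to be touched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,   -- surrogate key, no business meaning
    email       TEXT NOT NULL UNIQUE   -- natural candidate key, still enforced
);
CREATE TABLE invoice (
    invoice_id  INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    total       REAL NOT NULL
);
INSERT INTO customer (customer_id, email) VALUES (1, 'old@example.com');
INSERT INTO invoice (customer_id, total) VALUES (1, 99.0), (1, 15.5);
""")

# The natural value changes; invoices reference customer_id and are untouched.
conn.execute("UPDATE customer SET email = 'new@example.com' WHERE customer_id = 1")
invoice_count = conn.execute(
    """SELECT COUNT(*)
         FROM invoice i
         JOIN customer c USING (customer_id)
        WHERE c.email = 'new@example.com'"""
).fetchone()[0]


def duplicate_email_rejected():
    """True if the UNIQUE constraint blocks a second row with the same email."""
    try:
        conn.execute("INSERT INTO customer (email) VALUES ('new@example.com')")
    except sqlite3.IntegrityError:
        return True
    return False
```

    Had email been the primary key, the same change would have required updating (or cascading to) every invoice row.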

  • Why new? (Score:4, Interesting)

    by sootman (158191) on Friday July 09, 2010 @12:27AM (#32847572) Homepage Journal

    Database fundamentals haven't changed much. I don't know how much you know so far but this guy is pretty smart:
    http://philip.greenspun.com/sql/ [greenspun.com]
    http://philip.greenspun.com/panda/ [greenspun.com]
    http://philip.greenspun.com/wtr/ [greenspun.com]

    Lots of the core stuff about RDBMSs goes back decades and even old stuff like this is still very relevant. Try reading this page [greenspun.com] (just a dozen printed pages) and see what you think. He covers a lot of the fundamentals well and his style of writing is pretty entertaining.

  • And (Score:2, Insightful)

    by mahadiga (1346169)

    Since you're Director of IT, I'd recommend you to start from http://en.wikipedia.org/wiki/Database_normalization [wikipedia.org]

  • A lot of responses are referencing some good books and you should give them due consideration.

    One post spoke of set theory. Definitely look into that.

    ...

    Now for the real world. DB Theory is great, it is a base of knowledge that needs to be obtained and digested. You should understand 1st normal form all the way to 6th normal form, and that will take a bit to wrap your head around.

    Now that you have accomplished that, it is time to start breaking those rules and get some work done. Databases do one thing