Meetings:2010-06



Tuesday 8th June, 2010 at 7:30 pm

Topic

Distributed Databases with Chris Browne

Description

(defparameter presentation
      '("Distributed Databases"
	("Introduction" "GTALUG"
	 "2010-06-08"
	 "Christopher Browne")
	("Goals"
	 "Tolerance of failures"
	 "Improve Performance"
	 "Multiple Cheap Boxes (are they???)" 
	 "Improve network locality (rocketstick?)")
	("Performance"
	 "++ Iff Embarrassingly parallel"
	 "- Join across nodes: Slow!  Inconsistent!!!"
	 "- Communications costs worsen things"
	 "+ OK answer: Defer some work"
	 "- Adding RAM/CPUs is pretty successful")
	("ACID" 
	 "Atomicity" 
	 "Consistency"
	 "Isolation"
	 "Durability"
	 "Mighty heavy in all its glory") 
	("BASE"
	 "Basically Available" 
	 "Soft State"
	 "Eventually Consistent"
	 "Requires application rewrite!") 
	("BASE techniques"
	 ("Message Queueing"
	  "AMQP"
	  "Durable versus nondurable queues"
	  "DBus"
	  "Spread")
	 ("Partitioning"
	  "Range based"
	  "List Based"
	  "Hash based")
	 ("Improving OLTP"
	  "OLTP"
	  "Dumb Replica"
	  "Memcache"))
	("Alternative DB Models" 
	 "RDBMS"
	  "BigTable"
	  ("Document DB" 
	   "CouchDB"
	   "PICK (1965)"
	   "LDAP?"
	   "Lotus Notes/Domino")
	  ("DBM"
	   "SleepyCatDB"
	   "Voldemort"
	   "Tokyo Cabinet"
	   "create table dbm(name text PK,value text);")
	  ("Hierarchical DB" 
	   "MongoDB"
	   "CODASYL"
	   "LDAP?")
	  "Cache DB (memcached)"
	  ("Graph Databases"
	   "Neo4J"
	   "HyperGraphDB"
	   "Jena"))
	("Alternative Perspectives"
	 ("Transactional?" 
	  "Red Herring"
	  ("SQL without transactions"
	   "MySQL + MyISAM"
	   "MS Access + Jet")
	  ("Hashing with Transactions" 
	   "SleepyCat DB"
	   "Amazon Dynamo"))
	 ("HotSync" 
	  "Remember Palm?"
	  "Bloated Goats?"
	  "Git Repo"
	  "DropBox"
	  "Sync Against The Cloud"
	  ("Android" 
	   "Contacts"
	   "Email"
	   "AppBrain"
	   "Shuffle/Tracks (GTD)"
	   "WordPress"
	   "Libra (exercise client)")
	  "CouchDB/Desktop Couch")
	 ("Computing Models"
	  "Map/Reduce"
	  "Relational"
	  "Event Processing")
	 ("CAP Theorem" 
	  "Consistency"
	  "Availability"
	  "Partition Tolerance"
	  "Pick a maximum of TWO"
	  ("Two camps"
	   ("Traditional Database Camp"
	    "Optimistic locking"
	    "Sharding/Partitioning"
	    "Caching"
	    "Consistency is a 'raison d'etre'")
	   ("Nondatabase Camp"
	    "Use Files until they shatter"))))
	("Myths"
	 "Transactions are expensive"
	 "SQL = Transactions"
	 ("RDBMS doesn't scale"
	  "Actually, they scale mighty well.  It's not cheap, but they do harness big servers well."
	  "Nobody here is creating Facebook or Google.  You probably don't need to scale that way."
	  "Is the right answer to push servers everywhere, or is it cheaper to buy a cellular access point?"
	  "Recent benchmark: Postgres with fsync off beat everything else")
	 ("NoSQL doesn't have schemas"
	  "In Soviet Russia, Database Imposes Schema on You!"
	  "create table nosql(key text primary key, value text);")
	 ("NoSQL is simpler/easier... but..."
	  "You'll need to do consistency by hand"
	  "You'll need to do concurrency control by hand"
	  "Yeah, you never need to upgrade the schema.  But how are you sure you updated data to the new logical structure?"
	  "If you suck at SQL, what realistically indicates you'll suck LESS at something less structured?"
	  ("Bad reason for NoSQL"
	   "Bad programmers"
	   "'SQL is too hard!'"
	   "NoSQL is faster, and simpler, and easier"
	   "... Until you need the things SQL was providing"
	   "Bad programmers are all over the New! Hot!!! thing")))))

(defun generate-graphviz (number pname parent childlist)
  "Write slide NUMBER of presentation PNAME as a Graphviz file, PNAME-NNNN.dot:
one record node for the title PARENT and one per entry in CHILDLIST."
  (with-open-file (gv (format nil "~A-~4,'0D.dot" pname number)
		      :direction :output :if-exists :supersede)
    (format gv "digraph ~A_slide_~D { size=\"11,8\";~%" pname number)
    (format gv " title [label=\"~A\", shape=record];~%" parent)
    ;; One node per bullet, named child0001, child0002, ...
    (loop for child in childlist
	  for i from 1
	  do (format gv " child~4,'0D [label=\"~A\", shape=record];~%" i child))
    ;; One edge from the title to each bullet.
    (loop for i from 1 to (length childlist)
	  do (format gv " title -> child~4,'0D;~%" i))
    (format gv "}~%")))



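;; Running slide counter; the SETF resets it when this file is reloaded.
(defvar slidenumber 0)
(setf slidenumber 0)

;; Tree being rendered; starts at the full presentation.
(defvar current-tree nil)
(setf current-tree presentation)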

(defun generate-slides (current-tree)
  "Emit one slide for CURRENT-TREE (a title followed by its children),
then recurse into every child that is itself a list."
  (let ((title (first current-tree))
	(items (get-first-level (rest current-tree))))
    (format t "Slide ~D for: ~A~%    ~A~%~%" slidenumber title items)
    (generate-graphviz slidenumber "distributed_databases" title items))
  (incf slidenumber)
  (loop for subtree in (rest current-tree)
	unless (stringp subtree)
	  do (generate-slides subtree)))
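
generate-slides calls get-first-level, which this page never defines. Below is a minimal sketch, assuming the helper is meant to return one label per child: a string stands for itself, and a nested list is represented by its title. The definition is a guess at the intent, not the presenter's original code.

```lisp
;; Hypothetical reconstruction of the missing helper (an assumption,
;; not taken from the original page).
(defun get-first-level (children)
  "Return one label per element of CHILDREN: a string is its own label,
a nested list contributes its first element (its title)."
  (loop for child in children
        collect (if (stringp child)
                    child
                    (first child))))

;; Example on a fragment of the outline above:
(get-first-level '("AMQP" ("Partitioning" "Range based" "List Based") "Spread"))
;; => ("AMQP" "Partitioning" "Spread")
```

With a helper along these lines loaded, (generate-slides presentation) would walk the whole outline and write one .dot file per slide.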

Some Interesting Links

Location

TBA, but will likely be:

   Room GB248, Galbraith Building
   University of Toronto
   35 St George St
   Toronto, Ontario M5S 3G8


Schedule

  • 6:00 pm - GTALUGers get together at Pho 88 restaurant, 270 Spadina Ave (south of Dundas), for food and socializing.
  • 7:30 pm - Meeting and presentation.
  • 9:00 pm - After the meeting, a group of GTALUGers moves to the GSU Pub for beer and more socializing.

Photos
