[GTALUG] Good introduction to Kubernetes

Christopher Browne cbbrowne at gmail.com
Fri Jun 9 16:57:25 EDT 2017


On 9 June 2017 at 16:37, Myles Braithwaite via talk <talk at gtalug.org> wrote:
> <https://jvns.ca/blog/2017/06/04/learning-about-kubernetes/>
>
> I remember someone at a GTALUG meeting asking about Kubernetes (a new
> container thing).
>
> This is a really awesome introduction to what Kubernetes is and links to
> some good resources for more information.

Pretty good, indeed!

Last year, I grabbed notes at a session on containerization at the
PGCon UnConference; I'll attach them.  These notes speak to the
implications of running a database (e.g. PostgreSQL) inside
containers.  The fact that databases accrete data that you might want
to have survive a while means that running them is a bit different.

An interesting bit of "dogma" for container usage is that you're
supposed to generally run a single process in a container, and get
logs by virtue of the process spitting its activity to STDOUT.
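
As a minimal sketch of that convention (in Python; the logger name
"svc" and the line format are made up for illustration), the process
just writes its log lines to STDOUT and leaves collection to whatever
runs the container:

```python
import logging
import sys


def make_stdout_logger(name="svc", stream=None):
    """Return a logger that writes one line per event to STDOUT.

    `name` and the line format are illustrative choices, not a standard.
    `stream` exists only so the output can be captured when testing.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()      # avoid duplicate handlers on re-run
    logger.propagate = False     # keep the root logger out of it
    target = stream if stream is not None else sys.stdout
    handler = logging.StreamHandler(target)
    handler.setFormatter(logging.Formatter("%(levelname)s %(name)s %(message)s"))
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = make_stdout_logger()
    log.info("service started")  # the container runtime captures this line
```

There is no log file and no rotation inside the container; anything
beyond "print a line" is someone else's job.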

Another dogma is that you're supposed to manage configuration either by:
 a) Talking to the service, and having it manage itself.
     For PostgreSQL, this is fine in many cases, but there are some exceptions
 b) Restarting the service with new configuration files

For PostgreSQL, there are some things you can only do by logging onto
the server, messing with files, and restarting the database
service/server.  Kubernetes doesn't think that's something you can do.
(I expect that it doesn't generally have a tty that you can remotely
log into.)
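
To make option (a) a bit more concrete: since PostgreSQL 9.4, ALTER
SYSTEM SET followed by SELECT pg_reload_conf() lets a client change
many settings over an ordinary connection.  The sketch below only
builds the SQL text; the helper name and the example setting are my
own choices, and it does nothing for pg_hba.conf or for settings that
still need a restart.

```python
import re

# Conservative pattern for setting names; an assumption made for safety
# in this sketch, not PostgreSQL's own grammar.
_NAME_RE = re.compile(r"^[a-z_][a-z0-9_.]*$")


def alter_system_statements(settings):
    """Build SQL that changes settings over a normal database connection.

    Hypothetical helper for illustration.  Values are emitted as quoted
    string literals, with embedded single quotes doubled.
    """
    statements = []
    for name, value in settings.items():
        if not _NAME_RE.match(name):
            raise ValueError(f"suspicious setting name: {name!r}")
        literal = "'" + str(value).replace("'", "''") + "'"
        statements.append(f"ALTER SYSTEM SET {name} = {literal};")
    # Ask the running server to re-read its configuration.
    statements.append("SELECT pg_reload_conf();")
    return statements


if __name__ == "__main__":
    for sql in alter_system_statements({"log_min_duration_statement": "250ms"}):
        print(sql)
```

ALTER SYSTEM cannot run inside a transaction block, so whatever client
sends these statements would need to be in autocommit mode.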

*** Containers
**** Benefits
 - Dev to Prod is eased
   - Containers contain the same stuff; just fiddle with the config
 - Tools for containers are pretty good
   - Flannel - [[http://coreos.com/flannel][Flannel - software defined networking]]
     - For defining virtual network to give a subnet to each host for container runtimes
   - Kubernetes  - [[http://kubernetes.io][Kubernetes.io]]
     - for grouping containers
 - Need to understand cgroups
 - Practices and policies
   - makes hot-fixes more or less impossible, which tends to be a FEATURE
 - You don't run yum, apt-get upgrade, you don't ssh into it to fix things
   - the /proper/ solution is to recontainerize
**** Downsides
 - Doesn't handle schema migrations
   - All you get is fresh containers
 - Seeding data means shipping SQL around; that is megabytes of material, or more
 - Less isolation than with VMs
   - Shared kernel means lots more sharing between containers
   - Goal at RHAT is to lock things down more in containerization
 - Storage management is needful
   - Inside container is bad; you want to kill and restore containers, so data won't live across invocations
   - Separate mounted filesystem needs to be managed, identified, and attached, so more configuration to manage
 - Some things not/quasi-not doable
   - Managing ~postgresql.conf~ used to be a problem, but we can now manage it all via talking to DB
   - Managing ~pg_hba.conf~ and ~recovery.conf~ may be troublesome though
   - Can't restart postgres
   - Can't promote replica
   - How do we get at logs?
     - Standard container thing is to submit to stdout, not friendly
     - Formatting options require log collector, which can't go to stdout
     - Too many things are only available via logs
       - Is replication working (9.5 helped...)
       - Some error conditions only reported in logs
     - Solutions
       - Shared filesystem somewhere
       - syslog to rsyslog
       - Apache Kafka or such service
**** Dealing with containers
 - You have an image server
 - Security updates lead to creating new images
 - Need to ensure it's easy to compose new images mixing together desired "stuff"
   - Orchestrator used to compose new images
     - Mesos Marathon
     - Docker Swarm
     - Kubernetes
 - Docker added
   - layered images via UNION filesystem
     - Layer 1 - Fedora :: Basic
     - Layer 2 - PostgreSQL :: PG
     - Layer 3 - PG Analytics ::
     - Layer 4 - Replica ::
     - Layer 5 - Orchestrator injects host-specific stuff
       - Authorization
       - Environment variables
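
The layering above can be modelled, very loosely, as a stack of
lookups where upper layers shadow lower ones.  This is a toy in
Python, not how a union filesystem is implemented; the layer names
follow the notes, while the paths and contents are invented for
illustration.

```python
from collections import ChainMap

# Each layer maps path -> content.  Paths and contents are made up.
fedora = {"/etc/os-release": "Fedora", "/usr/bin/sh": "shell"}
postgresql = {"/usr/bin/postgres": "server binary"}
analytics = {"/usr/share/pg/analytics.sql": "extension SQL"}
replica = {"/etc/pg/role": "replica"}
host_specific = {"/etc/pg/role": "replica-of-db1", "/run/env": "PGPORT=5432"}


def union_view(*layers_bottom_to_top):
    """Return a lookup in which upper layers shadow lower ones."""
    # ChainMap searches its first mapping first, so put the top layer first.
    return ChainMap(*reversed(layers_bottom_to_top))


if __name__ == "__main__":
    view = union_view(fedora, postgresql, analytics, replica, host_specific)
    print(view["/etc/pg/role"])     # taken from the orchestrator's layer
    print(view["/etc/os-release"])  # falls through to the base layer
```

The lower layers are never modified, which is why the same Fedora and
PostgreSQL layers can sit underneath many different images.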
**** PostgreSQL Usage cases
 Wins come in cases like...
 - Doing minor version upgrades of PostgreSQL quickly
 - Seems like a good way to deploy tiny PostgreSQLs for something like pg_paxos
 - For a PostgreSQL upgrade, run a container with 9.5 and 9.6 and pg_upgrade
   - Shut down 9.5 container
   - Start 9.5-to-9.6 container to do upgrade, complete, shut down
   - Start 9.6 container against instance
   - Manage this via an Ansible process, not Kubernetes
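
The upgrade sequence above is essentially "run these steps in order,
and stop if one fails".  A minimal Python sketch of that control flow
follows; the container names are hypothetical, and the `execute`
callback stands in for whatever actually drives the containers (the
notes suggest Ansible).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    action: str     # "stop", "run-to-completion", or "start"
    container: str  # hypothetical container name


UPGRADE_PLAN = [
    Step("stop", "pg95"),
    Step("run-to-completion", "pg95-to-96-upgrade"),
    Step("start", "pg96"),
]


def run_plan(plan, execute):
    """Run steps in order and stop at the first failure.

    `execute(step)` must return True on success.  The list of steps that
    completed is returned, so the caller can see how far things got.
    """
    done = []
    for step in plan:
        if not execute(step):
            break
        done.append(step)
    return done


if __name__ == "__main__":
    # Dry run: print each step and pretend it succeeded.
    run_plan(UPGRADE_PLAN, lambda s: print(s.action, s.container) is None)
```

The point of stopping early is that the 9.6 container never starts
against a data directory that pg_upgrade did not finish converting.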
**** Ways of running Postgres in containers
 - Statefulness issues
   - Cannot easily migrate containers at zero cost; the disruption is quite noticeable
     - Kubernetes 1.3 is adding things to handle stateful containers
     - Flocker is apparently another thing created for this, but not well integrated with other orchestrators
 - Treat databases as ephemeral
   - Small DB backing web app is easy to get spun up again
   - Local storage on the host container filesystem
   - You need to have enough configured to re-replicate databases from their sources
 - Large DB instances that should not be ephemeral
   - Use network-based storage
   - If HA, then have storage be Ceph
   - Use what Kubernetes calls "persistent volumes" for the PGDATA directory
   - Fresh instances go through crash recovery
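
For the "persistent volumes" item, the thing a database pod asks for
is a PersistentVolumeClaim.  Here is a sketch that builds one as a
plain Python dict and prints it as JSON; the claim name and the size
are placeholders I picked, not values from the session.

```python
import json


def pgdata_claim(name="pgdata", size="10Gi"):
    """Return a PersistentVolumeClaim manifest as a plain dict.

    `name` and `size` are placeholder values for illustration.
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # One node mounts the volume read-write at a time.
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    print(json.dumps(pgdata_claim(), indent=2))
```

kubectl accepts JSON manifests as well as YAML, so the output could be
saved to a file and applied as-is.  The pod would then mount the claim
at whatever path PGDATA points to.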

-- 
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"

