Mobycast

Jon Christensen

A Podcast About Cloud Native Software Development, AWS, and Distributed Systems.

  • 41 minutes 13 seconds
    Hands On AWS - Massively Scalable Image Hosting Using S3 and CloudFront - Part 2

    In this episode, we cover the following topics:

    • We discuss the features and limitations of serving files directly from S3.
    • We then talk about how CloudFront can address many of S3's limitations. In particular, CloudFront is performant and inexpensive, and it allows us to use custom CNAMEs with TLS encryption.
    • How to create a secure CloudFront distribution for files hosted in S3.
    • What OAI (Origin Access Identity) is, why we need it, and how to set it up.
    • We show how you can configure your CloudFront distribution to use TLS and redirect HTTP to HTTPS.
    • We finish up by discussing "byte-range requests" and how to enable them for our image hosting solution.
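
    To make the byte-range idea concrete, here is a small sketch of our own (not code from the episode) showing standard HTTP `Range: bytes=start-end` semantics, which is the behavior S3 and CloudFront implement for ranged GETs:

```python
def parse_range(header: str, size: int):
    """Parse a single 'bytes=start-end' Range header value.

    Returns (start, end) inclusive, or None if the range is unsatisfiable.
    """
    unit, _, spec = header.partition("=")
    if unit != "bytes":
        return None
    start_s, _, end_s = spec.partition("-")
    if start_s == "":                     # suffix form: 'bytes=-N' (last N bytes)
        length = int(end_s)
        return (max(size - length, 0), size - 1)
    start = int(start_s)
    if start >= size:
        return None                       # 416 Range Not Satisfiable
    end = int(end_s) if end_s else size - 1
    return (start, min(end, size - 1))

def serve_range(content: bytes, header: str):
    """Return (status, body) the way an origin would answer a ranged GET."""
    rng = parse_range(header, len(content))
    if rng is None:
        return (416, b"")
    start, end = rng
    return (206, content[start : end + 1])   # 206 Partial Content

obj = b"0123456789"
print(serve_range(obj, "bytes=0-3"))   # (206, b'0123')
print(serve_range(obj, "bytes=-2"))    # (206, b'89')
```

    The slicing itself is plain HTTP; enabling it for the image host is mostly a matter of letting CloudFront pass range requests through to the S3 origin.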

    Detailed Show Notes

    Want the complete episode outline with detailed notes? Sign up here: https://mobycast.fm/show-notes/

    End Song

    Beauty in Rhythm by Roy England

    More Info

    For a full transcription of this episode, please visit the episode webpage.

    We'd love to hear from you! You can reach us at:

    8 July 2020, 12:00 pm
  • 43 minutes 25 seconds
    Hands On AWS - Massively Scalable Image Hosting Using S3 and CloudFront - Part 1

    In this episode, we cover the following topics:

    • A common feature for web apps is image upload. And we all know the "best practices" for how to build this feature. But getting it right can be tricky.
    • We start off by discussing the problem space, and what we want to solve. A key goal is to have a solution that is massively scalable while being cost-effective.
    • We outline the general architecture of the solution, with separate techniques for handling image uploading and downloading.
    • We then dive deep into how to handle image uploading, highlighting various techniques for controlling access over who can perform uploads.
    • Two common techniques for securing uploads when using AWS are presigned URLs and presigned POSTs. We discuss how each works and when to use one over the other.
    • We finish up by putting everything together and detailing the steps involved with uploading an image.
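
    To give a feel for how presigned URLs grant access, here is a deliberately simplified sketch of our own. It is not AWS's actual Signature Version 4 algorithm (in practice you would call boto3's `generate_presigned_url` or `generate_presigned_post`); it only illustrates the core idea: the server signs the method, object key, and expiry with a secret the client never sees, so the URL grants narrow, time-limited access:

```python
import hashlib
import hmac

SECRET = b"server-side-secret"  # hypothetical; stands in for the AWS secret key

def presign(method: str, key: str, expires_at: int) -> str:
    """Return a URL whose query string carries an expiry and an HMAC signature."""
    to_sign = f"{method}\n{key}\n{expires_at}".encode()
    sig = hmac.new(SECRET, to_sign, hashlib.sha256).hexdigest()
    return (f"https://example-bucket.s3.amazonaws.com/{key}"
            f"?Expires={expires_at}&Signature={sig}")

def verify(method: str, key: str, expires_at: int, sig: str, now: int) -> bool:
    """The service recomputes the signature; tampering or expiry fails."""
    if now > expires_at:
        return False
    expected = hmac.new(SECRET, f"{method}\n{key}\n{expires_at}".encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sig)

url = presign("PUT", "uploads/cat.jpg", expires_at=1_700_000_000)
```

    A presigned POST works on the same principle, but signs a policy document (allowed key prefix, content type, maximum size) instead of a single method-plus-key pair, which is why it is the better fit for browser form uploads.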

    Detailed Show Notes

    Want the complete episode outline with detailed notes? Sign up here: https://mobycast.fm/show-notes/

    Support Mobycast

    https://glow.fm/mobycast

    End Song

    Lazy Sunday by Roy England

    More Info

    For a full transcription of this episode, please visit the episode webpage.

    We'd love to hear from you! You can reach us at:

    1 July 2020, 8:00 am
  • 42 minutes 42 seconds
    Replay of Ep 43 - The Birth of NoSQL and DynamoDb – Part 5

    Show Details

    Jon Christensen and Chris Hickman of Kelsus and Rich Staats of Secret Stache conclude their series on the birth of NoSQL and DynamoDB. They compare the NoSQL database, Leviathan, created by Chris’s startup in the late 1990s to today’s DynamoDB. A lot of things haven’t changed, even though technology has evolved. It’s cyclical. There are patterns and problems that continue to dominate.

      

    Some of the highlights of the show include:

    • Reason for Creation of NoSQL Database: How to scale a database for Internet-scale applications, with a virtual pool of infinite storage that can be scaled out
    • Main Architecture Components of Leviathan:
      • API client
      • Update distributor (UD)
      • Base server (storage node)
      • Shepherd (housekeeping management system)  
    • Additional core components included smart IP and storage abstraction layer (SAL)
    • Leviathan mostly used C code and minimal Java code to support users
    • Big difference between DynamoDB and Leviathan is request router and partition metadata system living on the server vs. living on the edge
    • Leviathan was a closed system with an instance for every network or data center; not designed to run as a software as a service, like DynamoDB
    • Leviathan was strongly consistent, unlike DynamoDB’s eventually consistent model
    • Definition and Different Types of Transactions
    • Shepherd was used to identify and address consistency, synchronization, and timing issues
    • Rather than using a file system, Leviathan used relational databases 

    Links and Resources

    DynamoDB

    Microsoft SQL

    Oracle DB

    AWS IoT Greengrass

    Kelsus

    Secret Stache Media

     

    Quotes:

    “We had the same kind of problems that DynamoDB had - how do you scale your database dealing with Internet-scale applications and have this virtual pool of infinite storage that can be scaled out.” Chris Hickman

     

    “This system and this technology went through many iterations.” Chris Hickman

     

    “You can’t have a 100% consistent state across everything. It’s just impossible. How do you do the right thing?” Chris Hickman

     

    “The big difference between DynamoDB and Leviathan...is the request router and partition metadata system living on the server vs. living out at the edge.” Jon Christensen

    15 April 2020, 1:00 pm
  • 41 minutes 5 seconds
    Replay of Ep 42 - The Birth of NoSQL and DynamoDb – Part 4

    Show Details

    What’s under the hood of Amazon’s DynamoDB? Jon Christensen and Chris Hickman of Kelsus continue their discussion on DynamoDB, specifically its architecture and components. They draw on a presentation from re:Invent titled “Amazon DynamoDB Under the Hood: How we built a hyper-scale database.”

      

    Some of the highlights of the show include:

    • Partition keys and global secondary indexes determine how data is partitioned across storage nodes; this allows you to scale out, instead of up
    • Understand how a database is built to make architecture/component definitions less abstract
    • DynamoDB has four components:

      1. Request Router: Frontline service that receives and handles requests
      2. Storage Node: Services responsible for persisting and retrieving data
      3. Partition Metadata System: Keeps track of where data is located
      4. Auto Admin: Handles housekeeping aspects to manage the system

    • What level of uptime availability do you want to guarantee?
    • Replication: Strongly Consistent vs. Eventually Consistent
    • Walkthrough of Workflow: What happens when, what does it mean when…
    • DynamoDB architecture and components are designed to improve speed and scalability
    • Split Partitions: The longer your database is up and the more data you put into it, the more likely you are to get a hot partition or partitions that are too big
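
    A toy model (ours, not DynamoDB's actual implementation) can illustrate the strongly vs. eventually consistent read paths discussed above: writes are acknowledged by a leader and copied to replicas asynchronously, so a read served from a replica can briefly lag behind:

```python
import random

class ReplicatedItem:
    """Toy model: one leader plus replicas that apply writes asynchronously."""

    def __init__(self, n_replicas=2):
        self.leader = {}
        self.replicas = [dict() for _ in range(n_replicas)]
        self.pending = []          # writes not yet applied to replicas

    def put(self, key, value):
        self.leader[key] = value   # leader acknowledges immediately
        self.pending.append((key, value))

    def replicate(self):
        """Housekeeping step that drains pending writes to all replicas."""
        for key, value in self.pending:
            for replica in self.replicas:
                replica[key] = value
        self.pending.clear()

    def get(self, key, consistent=False):
        if consistent:
            return self.leader.get(key)            # strongly consistent read
        return random.choice(self.replicas).get(key)  # may be stale

db = ReplicatedItem()
db.put("user#1", "v2")
print(db.get("user#1", consistent=True))   # v2 (always current)
# db.get("user#1") may return None until db.replicate() runs
db.replicate()
print(db.get("user#1"))                    # v2 (replicas have converged)
```

    The real system replicates with a consensus protocol rather than a batch loop, but the trade-off it exposes to callers is the same: consistent reads cost more, eventually consistent reads can be stale.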

    Links and Resources

    DynamoDB

    re:Invent

    Amazon DynamoDB Under the Hood: How we built a hyper-scale database

    Paxos Algorithm

    Amazon S3

    Amazon Relational Database Service (RDS)

    MongoDB

    JSON

    Kelsus

    Secret Stache Media


    Quotes:

    “Keep in mind that data is partitioned across storage node, and that’s a key feature of being able to scale out, as opposed to scaling up.” Jon Christensen


    “Amazon was opening up the kimono...how DynamoDB has been architected and constructed and how it works.” Chris Hickman


    “Managed Service - they get to decide how it’s architected...because they also have to keep it up and live up to their SLA.” Chris Hickman


    “The longer the time that your database is up and the more data you put into it, the more likely that you’re going to get a hot partition or partitions are just going to get too big.” Chris Hickman

    8 April 2020, 1:00 pm
  • 29 minutes 40 seconds
    Replay of Ep 41 - The Birth of NoSQL and DynamoDb – Part 3

    Show Details

    Jon Christensen and Chris Hickman of Kelsus and Rich Staats of Secret Stache continue their discussion on the birth of NoSQL and DynamoDB. They examine DynamoDB’s architecture and popularity as a solution for Internet-scale databases. 


    Some of the highlights of the show include:

    • Challenges, evolution, and reasons associated with Internet-scale data
    • DynamoDB has been around a long time, but people are finally using it
    • DynamoDB and MongoDB are document or key-value stores that offer scalability and event-driven programming to reduce complexity
    • Techniques for keeping NoSQL database’s replicated data in sync
    • Importance of indexes to understand query patterns
    • DynamoDB’s Table Concept: Collection of documents/key-value items; a table must have a partition key to uniquely identify items and determine data distribution
    • Sort keys order items within a partition and enable indexes (i.e., global/local secondary indexes)
    • Query a DynamoDB database based on what’s being stored and using keys; analyze queries to determine their effectiveness
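
    The table concept above can be sketched in a few lines. This in-memory toy (our illustration, not DynamoDB's storage engine) shows how a partition key selects a partition and a sort key orders items within it, enabling begins_with-style queries:

```python
import bisect
from collections import defaultdict

class ToyTable:
    """Items live in partitions keyed by partition key, sorted by sort key."""

    def __init__(self):
        self.partitions = defaultdict(list)   # pk -> sorted [(sk, item), ...]

    def put(self, pk, sk, item):
        bisect.insort(self.partitions[pk], (sk, item))

    def query(self, pk, sk_begins_with=None):
        """Like a DynamoDB Query: one partition, optional sort-key condition."""
        return [item for sk, item in self.partitions[pk]
                if sk_begins_with is None or sk.startswith(sk_begins_with)]

t = ToyTable()
t.put("user#42", "order#2020-01-15", {"total": 10})
t.put("user#42", "order#2020-02-03", {"total": 25})
t.put("user#42", "profile", {"name": "Ada"})
print(t.query("user#42", sk_begins_with="order#"))  # both orders, in date order
```

    Analyzing queries then amounts to checking that every access pattern you need can be expressed as "one partition key, plus a sort-key condition"; anything else forces a scan or a secondary index.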


    Links and Resources

    AWS

    re:Invent

    DynamoDB

    NoSQL

    MongoDB

    Groupon

    JSON

    PostgreSQL

    Kelsus

    Secret Stache Media


    Quotes:


    “Kind of what drove this evolution from SQL to NoSQL - realizing that the constraints were now different, the economics of the resources that were being used.” Chris Hickman


    “People are realizing that Dynamo is not an ugly stepchild.” Jon Christensen


    “Event-driven programming...it’s very popular, and it’s going to become even more popular.” Chris Hickman


    End Song

    Benirrás Nights by Roy England ft. Dovetracks

    1 April 2020, 1:00 pm
  • 33 minutes 23 seconds
    Replay of Ep 40 - The Birth of NoSQL and DynamoDb – Part 2

    Show Details

    Jon Christensen and Rich Staats learn about Chris Hickman’s first venture-backed startup (circa 1998) and its goal to build a database for Internet-scale applications. His story highlights what software is all about – history repeating itself because technology/software is meant to solve problems via new tools, techniques, and bigger challenges at bigger scales.

    Some of the highlights of the show include:

    • Why Chris left Microsoft and how much it cost him; yet, he has no regrets
    • Chris’s concept addressed how to build a scalable database layer; how to partition, shard, and cluster; and how to make it highly available and a completely scale-out architecture
    • Chris couldn’t use the code he had created for it while at Microsoft, but from that he learned what he wouldn’t do again
    • Chris let the file system be the database at Microsoft, and the project was named Internet File Store (IFS); it used backend code and was similar to S3
    • Chris named his startup Viathan; had to do copyright, trademark, and domain name searches
    • Data for the Microsoft project could be stored in files/XML documents; Viathan took a different approach and used relational databases instead of a file system
    • Companies experienced problems at the beginning of the Internet; rest of ecosystem wasn’t developed and there weren’t enough people needing Internet solutions yet
    • Viathan went through several iterations that led to patents being issued and later being considered prior art
    • Viathan’s technology couldn’t just be plugged in and turned on; applications had to be modified – a tough sell
    • Chris did groundbreaking work for what would become DynamoDB

    Links and Resources

    AWS

    DynamoDB

    AWS re:Invent 2018 – Keynote with Werner Vogels

    re:Invent

    DeepRacer

    JSON

    Moby Dick

    MongoDB Acid Compliance

    Prior Art

    Kelsus

    Secret Stache Media


    25 March 2020, 1:00 pm
  • 33 minutes 1 second
    Replay of Ep 39 - The Birth of NoSQL and DynamoDB

    Chris Hickman and Jon Christensen of Kelsus and Rich Staats from Secret Stache offer a history lesson on the unique challenges of data at “Internet scale” that gave birth to NoSQL and DynamoDB. How did AWS get to where it is with DynamoDB? And, what is AWS doing now? 


    Some of the highlights of the show include:

    • Werner’s Worst day at Amazon: Database system crashes during Super Saver Shipping
    • Amazon strives to prevent problems that it knows will happen again by realizing relational database management systems aren’t built/designed for the Internet/Cloud
    • Internet: Scale up vs. scale out via databases or servers; statefulness of databases prevents easy scalability
    • Need sharding and partitioning of data to have clusters that can be scaled up individually
    • Amazon’s Aha Moment: Realization that 90% of data accessed was simplistic, rather than relational; same thing happened at Microsoft - recall the Internet Tidal Wave memo?
    • Challenge of building applications using CGI bin when Internet was brand new
    • Solution: Build your own Internet database; optimize for scalability 
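
    The sharding and partitioning idea mentioned above can be sketched in a few lines (our illustration): hash each key to deterministically pick a shard, so every node stores and serves only its slice of the data and the fleet scales out rather than up:

```python
import hashlib

def shard_for(key: str, n_shards: int) -> int:
    """Map a key to a shard with a stable hash (stable across processes)."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % n_shards

# Every key deterministically lands on one of 4 shards, so each node
# stores (and serves) roughly 1/4 of the data.
placement = {k: shard_for(k, 4) for k in ["user#1", "user#2", "order#7"]}
print(placement)
```

    The hard part, which the episode hints at, is everything around this one line: routing requests, rebalancing when shards split, and keeping replicas of each shard in sync.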

    Links

    AWS

    re:Invent

    DynamoDB

    NoSQL

    AWS re:Invent 2018 - Keynote with Andy Jassy

    AWS re:Invent 2018 - Keynote with Werner Vogels

    Oracle Database

    Bill Gates’ Internet Tidal Wave

    CGI Bin

    Kelsus

    Secret Stache Media



    End Song

    Whisper in a Dream by Uskmatu


    More Info

    We'd love to hear from you! You can reach us at:

    18 March 2020, 1:00 pm
  • 27 minutes 29 seconds
    Replay of Ep 14. Stop Worrying About Cloud Lock-in

    Original Show Notes:
    At the recent Gluecon event, a popular topic centered on how to prevent Cloud Lock-in. Chris Hickman and Jon Christensen of Kelsus and Rich Staats from Secret Stache discuss why your time is better spent focusing on one cloud provider. If/when Cloud Lock-in becomes an issue, you will have the resources to deal with it.

    Some of the highlights of the show include:

    • AWS Fargate is ‘serverless ECS’. You don’t need to manage your own cluster nodes. This sounds great, but we’ve found the overhead of managing your own cluster to be minimal. Fargate is more expensive than ECS, and you have greater control if you manage your own cluster.
    • Cloud lock-in was a huge concern among people at Gluecon 2018. People from large companies talked about ‘being burned’ in the past with vendor lock-in. The likely risks are (1) price gouging and (2) vendors going out of business.
    • Cloud allows people to deploy faster and more cheaply than running their own hardware, as long as you don’t have huge scale. Few businesses get large enough to need their own data center on-prem to save money.
    • Small and startup companies often start off in the Cloud. Big companies often have their own data centers and they are now migrating to the Cloud.
    • AWS does allow you to run their software in your own data center, but this ties you to AWS.
    • There is huge complication and risk to architecting a system to run in multiple cloud environments, and it almost certainly wouldn’t run optimally in all clouds.
    • We think the risk of AWS hiking prices drastically, or going out of business, is essentially zero.
    • If you were building a microservice-based multi-cloud system, some of the difficulties include: Which cloud hosts the database? How do I spread my services across 2 clouds? What about latency between cloud providers’ networks? How do I maintain security? How do I staff people who are experts at operating in both clouds?
    • It’s clear that lock-in is a real fear for many companies, regardless of our opinion that it shouldn’t be such a concern.
    • Jon thinks the fear of lock-in may drive cloud providers toward standardization; Chris thinks AWS doesn’t have a compelling reason to standardize since they’re the industry leader.
    • Our advice: as a small or medium size company, don’t worry about cloud lock in. If you get big enough that it’s really a concern, we recommend building abstractions for the provider-specific parts of your system, and having a backup of your system ready to run in a 2nd cloud provider, but don’t try to run them concurrently.
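
    The "build abstractions for the provider-specific parts" advice might look like this minimal sketch (all names here are hypothetical, not from the episode): application code depends on a narrow interface, and each provider gets its own implementation behind it, so porting to a second cloud means writing one class rather than rewriting the app:

```python
from abc import ABC, abstractmethod

class BlobStore(ABC):
    """Narrow, provider-neutral interface the application codes against."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...

class InMemoryBlobStore(BlobStore):
    """Test double; an S3-backed or GCS-backed class would implement the
    same two methods using the provider's SDK."""

    def __init__(self):
        self._blobs = {}

    def put(self, key, data):
        self._blobs[key] = data

    def get(self, key):
        return self._blobs[key]

def save_avatar(store: BlobStore, user_id: str, image: bytes) -> None:
    """App code never touches a provider SDK directly."""
    store.put(f"avatars/{user_id}.png", image)

store = InMemoryBlobStore()
save_avatar(store, "42", b"\x89PNG...")
```

    Note this matches the episode's advice: keep the abstraction as insurance, but do not try to run both implementations concurrently in production.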

    Links and Resources

    11 March 2020, 2:00 pm
  • 2 minutes 15 seconds
    Learn cloud native software development by podcast

    Start with 39. The Birth of NoSQL and DynamoDB – Part 1.

    If you like that one, finish the five part series.

    Still want more? Ask me for advice and I'll tell you which are the next best ones at [email protected].

    8 March 2020, 8:00 pm
  • 1 hour 8 minutes
    Automate all the things - Updating container secrets using CloudWatch Events + Lambda

    In this episode, we cover the following topics:

    • Developing a system for automatically updating containers when secrets are updated is a two-part solution. First, we need to be notified when secrets are updated. Then, we need to trigger an action to update the ECS service.
    • CloudWatch Events can be used to receive notifications when secrets are updated. We explain CloudWatch Events and its primary components: events, rules and targets.
    • Event patterns are used to filter for the specific events that the rule cares about. We discuss how to write event patterns and the rules of matching events.
    • The event data structure will be different for each type of emitter. We detail a handy tip for determining the event structure of an emitter.
    • We discuss EventBridge and how it relates to CloudWatch Events.
    • We explain how to create CloudWatch Event rules for capturing update events emitted by both Systems Manager Parameter Store and AWS Secrets Manager.
    • AWS Lambda can be leveraged as a trigger of CloudWatch Events. We explain how to develop a Lambda function that invokes the ECS API to recycle all containers.
    • We finish up by showing how this works for a common use case: using the automatic credential rotation feature of AWS Secrets Manager with a containerized app running on ECS that connects to a RDS database.
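
    The event-pattern matching rules described above can be illustrated with a simplified re-implementation of our own (the real service supports additional operators): a pattern field matches when it exists in the event and the event's value is one of the pattern's listed values, recursing into nested objects:

```python
def matches(pattern: dict, event: dict) -> bool:
    """Simplified CloudWatch Events matching: every field in the pattern must
    exist in the event, and the event's value must appear in the pattern's
    value list. Nested objects are matched recursively; extra event fields
    are ignored."""
    for field, allowed in pattern.items():
        if field not in event:
            return False
        value = event[field]
        if isinstance(allowed, dict):
            if not (isinstance(value, dict) and matches(allowed, value)):
                return False
        elif value not in allowed:
            return False
    return True

# Example rule: fire when a Parameter Store parameter is updated.
rule = {
    "source": ["aws.ssm"],
    "detail-type": ["Parameter Store Change"],
    "detail": {"operation": ["Update"]},
}
event = {
    "source": "aws.ssm",
    "detail-type": "Parameter Store Change",
    "detail": {"name": "/myapp/db-password", "operation": "Update"},
}
print(matches(rule, event))  # True
```

    The Lambda target would then call the ECS API (for example, forcing a new deployment of the service) so that all containers restart and pick up the rotated secret.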


    Detailed Show Notes

    Want the complete episode outline with detailed notes? Sign up here: https://mobycast.fm/show-notes/

    Support Mobycast

    https://glow.fm/mobycast

    End Song

    Night Sea Journey by Derek Russo

    More Info

    For a full transcription of this episode, please visit the episode webpage.

    We'd love to hear from you! You can reach us at:

    4 March 2020, 1:00 pm
  • 49 minutes 21 seconds
    Database Soup - Explaining ACID, BASE, CAP - Part 3

    In this episode, we cover the following topics:

    • In this new series, we are discussing database consistency models explained in three acts. This episode is "Act III: Eventual consistency saves the web (circa early 2000s)".
    • We explain eventual consistency and the motivation behind the philosophy.
    • The BASE acronym stands for three key properties of a distributed system that utilizes eventual consistency. We define and explain these BASE attributes:
      • Basically available
      • Soft state
      • Eventual consistency
    • We share the story of Werner Vogels’ keynote at re:Invent 2018, where he outlined the reasons why DynamoDB was created. In particular, DynamoDB allows for an eventually consistent data model.
    • Interestingly, the DynamoDB story closely parallels what happened when Chris was at Microsoft. It just happened at least 6 years earlier.
    • We then wrap up everything we have learned about ACID, CAP, and BASE by providing some guidelines on when to choose ACID vs. BASE systems.
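
    One way to see "eventual consistency" concretely is a last-writer-wins merge, sketched here as our own illustration (DynamoDB's internals differ): replicas that each stayed "basically available" and accepted writes independently converge once they exchange and merge state:

```python
def merge_lww(a: dict, b: dict) -> dict:
    """Last-writer-wins anti-entropy: for each key, keep the (timestamp, value)
    pair with the newest timestamp. Running this on both replicas makes them
    converge -- the 'eventual' in eventual consistency."""
    merged = dict(a)
    for key, (ts, value) in b.items():
        if key not in merged or merged[key][0] < ts:
            merged[key] = (ts, value)
    return merged

# Two replicas accepted writes independently while partitioned.
replica_a = {"cart": (1, ["book"]), "theme": (5, "dark")}
replica_b = {"cart": (3, ["book", "pen"])}

converged = merge_lww(replica_a, replica_b)
print(converged)  # {'cart': (3, ['book', 'pen']), 'theme': (5, 'dark')}
```

    The "soft state" property is visible here too: until the merge runs, each replica's view is provisional and may be rewritten by newer information.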


    Detailed Show Notes

    Want the complete episode outline with detailed notes? Sign up here: https://mobycast.fm/show-notes/

    Support Mobycast

    https://glow.fm/mobycast

    End Song

    Whisper In A Dream (Feathericci Remix) by Uskmatu

    More Info

    For a full transcription of this episode, please visit the episode webpage.

    We'd love to hear from you! You can reach us at:

    26 February 2020, 1:00 pm