Screaming in the Cloud

By Corey Quinn

Category: Technology

Screaming in the Cloud with Corey Quinn features conversations with domain experts in the world of Cloud Computing. Topics discussed include AWS, GCP, Azure, Oracle Cloud, and the "why" behind how businesses are coming to think about the Cloud.

Episode Date
Episode 23: Most Likely to be Misunderstood: The Myth of Cloud Agnosticism
It is easy to pick apart the general premise of Cloud agnosticism being a myth. What about reasonable use cases? Well, generally, when you have a workload that you want to put on multiple Cloud providers, it is a bad idea. It’s difficult to build and maintain. Providers change, some more than others. The ability to work with them becomes more complex. Yet, Cloud providers rarely disappoint you enough to make you hurry and go to another provider. Today, we’re talking to Jay Gordon, Cloud developer advocate for MongoDB, about databases, distribution of databases, and multi-Cloud strategies. MongoDB is a good option for people who want to build applications quicker and faster but not do a lot of infrastructural work. Some of the highlights of the show include: Easier to consider distributed data to be something reliable and available, than not being reliable and available People spend time buying an option that doesn’t work, at the cost of feature velocity If Cloud provider goes down, is it the end of the world? 
Cloud offers greater flexibility; but no matter what, there should be a secondary option when a critical path comes to a breaking point Hand-off from one provider to another is more likely to cause an outage than a multi-region single provider failure Exclusion of Cloud Agnostic Tooling: The more we create tools that do the same thing regardless of provider, there will be more agnosticism from implementers Workload-dependent where data gravity dictates choices; bandwidth isn’t free Certain services are only available on one Cloud due to licensing; but tools can help with migration Major service providers handle persistent parts of architecture, and other companies offer database services and tools for those providers Cost may/may not be a factor why businesses stay with 1 instead of multi-Cloud How much RPO and RTO play into a multi-Cloud decision Selecting a database/data store when building; consider security encryption Links: Jay Gordon on Twitter MongoDB The Myth of Cloud Agnosticism Heresy in the Church of Docker Kubernetes Amazon Secrets Manager JSON Digital Ocean
Aug 10, 2018
Episode 22: The Chaos Engineering experiment that is us-east-1
Trying to convince a company to embrace the theory and idea of Chaos Engineering is an uphill battle. When a site keeps breaking, Gremlin’s plan involves breaking things intentionally. How do you introduce chaos as a step toward making things better? Today, we’re talking to Ho Ming Li, lead solutions architect at Gremlin. He takes a strategic approach to deliver holistic solutions, often diving into the intersection of people, process, business, and technology. His goal is to enable everyone to build more resilient software by means of Chaos Engineering practices. Some of the highlights of the show include: Ho Ming Li previously worked as a technical account manager (TAM) at Amazon Web Services (AWS) to offer guidance on architectural/operational best practices Difference between and transition to solutions architect and TAM at AWS Role of TAM as the voice and face of AWS for customers Ultimate goal is to bring services back up and make sure customers are happy Amazon Leadership Principles: Mutually beneficial to have the customer get what they want, be happy with the service, and achieve success with the customer Chaos Engineering isn’t about breaking things to prove a point Chaos Engineering takes a scientific approach Other than during carefully staged DR exercises, DR plans usually don’t work Availability Theater: A passive data center is not enough; exercise DR plan Chaos Engineering is bringing it down to a level where you exercise it regularly to build resiliency Start small when dealing with availability Chaos Engineering is a journey of verifying, validating, and catching surprises in a safe environment Get started with Chaos Engineering by asking: What could go wrong? 
Embrace failure and prepare for it; business process resilience Gremlin’s GameDay and Chaos Conf allows people to share experiences Links: Ho Ming Li on Twitter Gremlin Gremlin on Twitter Gremlin on Facebook Gremlin on Instagram Gremlin: It’s GameDay Chaos Engineering Slack Chaos Conf Amazon Leadership Principles Adrian Cockcroft and Availability Theater Digital Ocean
Aug 08, 2018
Episode 21: Remember when RealNetworks used to-- BUFFERING
Are you about to head off to college? Interested in DevOps and the Cloud? Is there a good way for someone like you who is starting out in the world of technology to absorb the necessary skills? The Open Source Lab (OSL) at Oregon State University (OSU) is one program that helps students and serves as a career accelerator. OSL is a unicorn because OSU is willing to invest in open source. Today, we’re talking to Lance Albertson, director of OSL at OSU. OSL does a variety of projects to provide private Clouds that are neutrally hosted on its premises. The lab also gives undergraduate students hands-on experience with DevOps skills, including dealing with configuration management, deploying applications, learning how applications deploy, working with projects, and troubleshooting issues. OSL is for any student who has a general interest or passion for it, and a willingness to learn. Some of the highlights of the show include: Workflow focuses on what students need to learn about Linux and giving access to various repos; then they experience the lab’s configuration management suite Interview Process: Put out a posting, student submits an application online, each candidate is reviewed, student is given a screening quiz, If a student passes the screening process, they are brought in for an in-person interview for personality and technical questions Students tend to initially have the least amount of experience and most difficulty with a repository that has multiple people committing to it and dealing with PRs Spinning up VMs and understanding how configuration management is connected, how services communicate, and how to set up an application Round-Robins and System Sprint Meetings: Focus on discussing and documenting processes, issues, suggestions, comments, and other information Younger students are mentored by Lance and the older students; every generation has to evolve because the environment and industry evolve OSL made OpenStack work on POWER8, PowerPC, and PowerPC 
little-endian; gateway into Cloud - having an OpenStack instance to offer services Vast majority of OSL’s revenue comes from donations; no direct support from the university; finding companies to serve as sponsors is beneficial to all Future of OSL: Providing more Cloud-like services; creating a more internal, private Cloud; and containerized ways of running or deploying applications Links: Apache Software Foundation BusyBox Buildroot Chef Ruby Freenode OpenStack Sphinx Docker Neutron Seth Rackspace CoreOS Kubernetes Digital Ocean
Aug 01, 2018
Episode 20: The Wizard of AWS
Today, we’re talking to Jeff Barr, vice president and chief evangelist at Amazon Web Services (AWS). He founded the AWS Blog in 2004 and has written more than 2,900 posts for it and another 1,100 for his personal blog. As chief evangelist, Jeff strives to explain the benefits of Cloud computing and Web services to anyone who will listen. Jeff is the voice of AWS. He does what he does best: he exploits his superpower of explaining technology in ways that people can understand. Jeff tries to be the same person all the time. He loves to meet people and goes out of his way to say “Hello.” So, if you see him at re:Invent, say “Cheese” and take a selfie with him! Some of the highlights of the show include: Jeff uses AWS Workspaces for his blog; one of Jeff’s blogging principles is to not take anybody else's word for anything to the absolute best of his technical ability Zero Client: Jeff has no rotating hardware, disk drives, just a zero client; wherever he is, it's the same workspace AWS has something for everyone; it builds things in response to customers’ questions, requests, and feedback Naming Services and Products: Is it helpful? Is it descriptive? Does it have any hidden meanings? Amazonian DNA and Dog Friendly Workspace: Jeff went from super fearful to accepting, to now thinking of dogs as incredible creations because they add fun and excitement to the office As part of hiring, each interviewer is assigned Amazon leadership principles (LPs) to ask questions that measure a candidate against those LPs What is the secret to getting hired at Amazon? Study the LPs to understand what they're about and be able to express your philosophies and history with LPs re:Invent makes sure customers understand services - What is it? What does it do? How do they put it to work? What are the best use cases for it? 
Things can never be too simple; you start from zero, put a lot of different things in there, and then you need the feedback to build in simplicity AWS is following a more on-demand approach than traditional reserved instances; it opens the door to being used in a lot of ways AWS does a lot of work before a launch to make sure it’s got infrastructure, scaling, monitoring, and capacity in place If you are a customer, talk to AWS and let them know what they're doing right or wrong; write a blog post, tweet about it, share it with them in some way Is the breadth of product offerings from AWS too vast? Is it offering too many things? AWS was not explicit about where it was going with Cloud computing, nor did it do analyses or projections about it; it simply launched SQS and let it speak for itself Customer feedback shapes what Amazon works on; customers share and then AWS re-prioritizes to make sure it’s delivering the right thing at the right time Remember: It's not just bits and bytes, it's about the organic life form Links: Jeff Barr on Twitter Jeff Barr on LinkedIn AWS AWS Blog Jeff Barr’s Blog Amazon Machine Images Zero Client AWS Workspaces AWS Lambda Amazon Leadership principles re:Invent The Robot Uprising Will Have Very Clean Floors Serverlessly Storing My Dad Jokes in a Dadabase Days Until re:Invent
Jul 25, 2018
Episode 19: I want to build a world spanning search engine on top of GCP
Some companies that offer services expect you to do things their way or take the highway. However, Google expects people to simply adapt the tech company’s suggestions and best practices for their specific context. This is how things are done at Google, but this may not work in your environment. Today, we’re talking to Liz Fong-Jones, a Senior Staff Site Reliability Engineer (SRE) at Google. Liz works on the Google Cloud Customer Reliability Engineering (CRE) team and enjoys helping people adapt reliability practices in a way that makes sense for their companies. Some of the highlights of the show include: Liz figures out an appropriate level of reliability for a service and how a service is engineered to meet that target Staff SRE involves implementation, and then identifying and solving problems Google’s CRE team makes sure Google Cloud customers can build seamless services on the Google Cloud Platform (GCP) Service Level Objectives (SLOs) include error budgets, service level indicators, and key metrics to resolve issues when technology fails Learn from failures through instant reports and shared post-mortems; be transparent with customers and yourself GCP: Is it part of Google or not? It’s not a division between old and new. Perceptions and misunderstandings of how Google does things and how it’s a different environment Google’s efforts toward customer service and responsiveness to needs Migrating between different Cloud providers vs. higher level services How to use Cloud machine learning-based products GCP needs to focus on usability to maintain a phase of growth Offer sensible APIs; tear up, turn down, and update in a programmatic fashion Promotion vs. Different Job: When you’ve learned as much as you can, look for another team to teach something new What is Cloud and what isn’t? Cloud deployments require SRE to be successful but SREs can work on systems that do not necessarily run in the Cloud. 
Links: Cloud Spanner Kubernetes Cloud Bigtable Google Cloud Platform blog - CRE Life Lessons Google SRE on YouTube
Jul 19, 2018
Episode 18: Sitting on the curb clapping as serverless superheroes go by
What’s serverless? Are you serverless now? Is going from enterprise to serverless a natural evolution? Or, is it a “that was fun, now let’s go ride our bikes” moment? Is serverless “just a toy?” Is it a wide and varied ecosystem, or is it Lambda plus some other randos? What's up with serverless vs. containers? Today, Forrest Brazeal is here to answer those questions and discuss pros and cons of serverless. He was a senior Cloud architect prior to joining Trek10. Forrest spent several years leading AWS and serverless engineering projects at Infor. He understands the challenges faced by enterprises moving to the Cloud and enjoys building solutions that provide maximum business value at a minimal cost.  Some of the highlights of the show include: Bimodality: Backend development going away and being replaced by managed services; undifferentiated items are being moved to the Cloud Serverless is application designs with “Backend as a Service” (BaaS) and/or “Functions as a Service” (FaaS) platforms; everything is managed for you AWS Lambda: Is it today’s trend or a bias that everyone is using it; Lambda makes up 80% of current FaaS adoption Serverless Ecosystem: You can build it however you want, and you’re doing it right; but don’t take that at face-value; no two Lambda environments are alike Cloud services at this scale have not been knitted together to form applications that are serving major workloads; best practices need to be established Native Cloud providers will consolidate, and individual frameworks will be created with components of application stacks tied together to build systems Serverless vs. 
Containers: No need for disparity - we can learn to get along; people use containers because it is easier than going serverless Serverless Heroes series features people thinking out-of-the-box and helps identify emerging trends; serverless is growing, and it’s not just about startups Went from working with a Sharpie to Procreate for the FaaS and Furious cartoon series; serverless component of process is for invoicing     Changes? Packaging to handle sharing; more knobs on console; unified process needed because too many building own workflow and tooling Certification: Proof-positive that you know what you’re talking about or is it questionable value if not backing up expertise in the real world? Links: Forrest Brazeal on Twitter Invoiceless Summon the vast power of certification - Dilbert cartoon Trek10 blog A Cloud Guru ThinkfaaS podcast A Cloud Guru - Serverless Superheros Why We’re Excited About AWS AppSync Serverless Architectures with Mike Roberts AWS Lambda AWS Serverless Application Model (SAM) Procreate AWS Certified Cloud Practitioner Serverlessconf Digital Ocean
Jul 11, 2018
Episode 17: Pouring Kubernetes on things with reckless abandon
DevOps as a service describes what Reactive Ops is trying to do, who it’s trying to help, and what problems it’s trying to solve. Its passion for delivering service where human beings help other human beings is carried out by a group of engineers who are extremely good at solving problems. Sarah Zelechoski is the vice president of engineering at Reactive Ops, which defines the world’s problems and solves them by pouring Kubernetes on top of them. The team focuses on providing expert-level guidance and a curated framework using Kubernetes and other open source tools. Sarah's greatest passion is helping others, which encompasses advocating for engineers and rekindling interest in the lost art of service in the tech space. Some of the highlights of the show include: Kubernetes is changing the way people work; it offers a way to release a product, provide access to it, and define its behavior when you deploy it Any person/business can use Kubernetes to mold their workflow Kubernetes is complex and has sharp edges; it has only recently become productive because of its community finding and reporting issues Business value of deploying Kubernetes to a new environment: Flexibility and uniform system of management; and it can provide a context shift Implementation Challenges with Workshops/Tutorials: Valuable entry level strategy for people learning Kubernetes; but the translation is not easy About 85% of the work Reactive Ops does to help its customers get onto Kubernetes is spent on application architecture If thinking about moving to Kubernetes, how well will your current applications translate? Do you want to start over from scratch? 
Value in paying someone to do something for you Using Defaults: Try initially until you realize what you need; Kubernetes gives you options, but it’s a challenging path to go from defaults to advanced Deploying a workload between all major Cloud providers is possible, but there are challenges in managing multiple regions or locations Cluster Ops: Managed Kubernetes clusters where Reactive Ops stays on the map, watches them, and puts them on pager, so you can continue your work without having to worry Links: Sarah Zelechoski on Twitter Reactive Ops Kubernetes GKE from GCB AKS from Azure EKS from AWS Kops Terraform Slack
Jul 04, 2018
Episode 16: There are Still Servers, but We Don't Care About Them
Are you interested in going beyond basic monitoring and visibility? Need tools to build and operate serverless applications and extract business intelligence? IOpipe provides extended visibility and metrics around AWS Lambda, including profiling, core dumps, and incoming input events. Today, we’re talking to Erica Windisch, who is the founder and CTO of IOpipe. She brings her experience in building developer and operational tooling to serverless applications. Erica also has more than 17 years of experience designing and building Cloud infrastructure management solutions. She was an early and longtime contributor to OpenStack and maintainer of the Docker project. Some of the highlights of the show include: Nomenclature Battle: Serverless vs. stateless Building a window of visibility into Lambda: Talking to users and assessing needs/pain points Observability of the infrastructure: Necessary evil to get to automated healing Using Lambda at significant levels of scale; some companies grow usage, others go all in right away Current state of Lambda ecosystem Is Lambda stable? Indications and no formal SLA How issues manifest and are exposed Trends include cold starts, hours-long failures, and multiple function evokes Infrastructure powering IOpipe: Lambda issues may impact performance of monitoring system, but IOpipe is not necessarily dependent on Lambda Future of Lambda: Builds applications a specific way, but there are limitations What would Erica change about Lambda? Run function and define handlers Lambda functions can be difficult to understand; some developers do not have familiarity and create bottlenecks Capacity limits around Lambda can be difficult to establish Links: Erica Windisch on Twitter Erica Windisch on Twitch IOpipe 12-Factor App Cloud Custodian in Lambda Velocity London ServerlessConf London re:Invent AWS Glue
Jun 27, 2018
Episode 15: Nagios was the Original Call of Duty
Let’s chat about the Cloud and everything in between. The people in this world are pretty comfortable with not running physical servers on their own, but trusting someone else to run them. Yet, people suffer from the psychological barrier of thinking they need to build, design, and run their own monitoring system. Fortunately, more companies are turning to Datadog. Today, we’re talking to Ilan Rabinovitch, Datadog’s vice president of product and community. He spends his days diving into container monitoring metrics, collaborating with Datadog’s open source community, and evangelizing observability best practices. Previously, Ilan led infrastructure and reliability engineering teams at various organizations, including Ooyala and He’s active in the open source and DevOps communities, where he is a co-organizer of events, such as SCALE and Texas Linux Fest. Some of the highlights of the show include: Datadog is well-known, especially because it is a frequent sponsor More organizations know their core competency is not monitoring or managing servers Monitoring/metrics is a big data problem; Datadog takes monitoring off your plate Alternate ways, other than using Nagios, to monitor instances and regenerate configurations Datadog is first to identify patterns when there is a widespread underlying infrastructure issue Trends of moving from on-premise to Cloud; serverless is on the horizon How trends affect evolution of Datadog; adjusting tools to monitor customers’ environments Datadog’s scope is enormous; the company tries to present relevant information as the scale of what it’s watching continues to grow Datadog’s pricing is straightforward and simple to understand; how much Cloud providers charge to use Datadog is less clear Single Pane of Glass: Too much data to gather in small areas (dashboards)   Why didn’t monitoring catch this? 
Alerts need to be actionable and relevant How to use Datadog’s workflow for setting alerts and work metrics Datadog’s first Dash user conference will be held in July in New York; addresses how to solve real business problems, how to scale/speed up your organization Links: Ilan Rabinovitch on Twitter Datadog Docker Adoption Survey Results   Rubric for Setting Alerts/Work Metrics Dash Conference re:Invent Nagios
Jun 20, 2018
Episode 14: Cheslocked and loaded
Do you need captured data that lets you know when things don’t look quite right? Need to identify issues before they become major problems for your organization? Turn to Threat Stack, which has Cloud issues of its own, and helps its customers with their Cloud issues. Today, I’m talking to Pete Cheslock, who runs technical operations at Threat Stack, which handles security monitoring, alerting, and remediation. The company uses Amazon Web Services (AWS), but its customer base can run anywhere. Some of the highlights of the show include: Challenges Threat Stack experienced with AWS and how it dealt with them Threat Stack helps companies improve their security posture in AWS Security shouldn’t be an issue, if providers do their job; shared responsibility Education is needed about what matters regarding security, avoiding mistakes Cloud is still so new; not many people have broad experience managing it Scanning customer accounts against best practices to identify risks Threat Stack’s scanning tool is worthwhile, but most tools lack judgement and perspective Threat Stack offers context between host- and Cloud-based events; tying data together is the secret sauce You shouldn’t have to pay a bunch of money to have a robust security system Good operations is good security; update, patch, track, and perform other tasks Lack of validation about what services are going to be successful or not Vendor Lock-in: Understand your choices when building your system Pervasiveness and challenge of containerization and Kubernetes Cloud reduces cycle time and effort to bring a product to market Amazon is a game changer with what it allows you to do and solve problems Links: Pete Cheslock Digital Ocean Threat Stack AWS re:Invent Kubernetes
Jun 13, 2018
Episode 13: Serverlessly Storing my Dad Jokes in a Dadabase
Aurora, from Amazon Web Services (AWS), is a MySQL-compatible service for complex database structures. It offers capabilities and opportunities. But with Aurora, you’re putting a lot of trust in AWS to “just work” in ways not traditional to relational database services (RDS). David Torgerson, Principal DevOps Engineer at Lucidchart, is a mystery wrapped in an enigma and virtually impossible to Google. He shares Lucidchart’s experience with migrating away from a traditional RDS to Aurora to free up developer time. Some of the highlights of the show include: Trade off of making someone else partially responsible for keeping your site up Lucidchart’s overall database costs decreased 25% after switching to Aurora Aurora unknowns: What is an I/Op in Aurora? When you write one piece of data, does it count as six I/Ops? Multi-master Aurora is coming for failover time and disaster recovery purposes Aurora drawbacks: No dedicated DevOps, increased failover time, and misleading performance speed Providers offer ways to simplify your business processes, but not ways to get out of using their products due to vendor and platform lock-in Lucidchart is skeptical about Aurora Serverless; will use or not depending on performance Links: Corey's architecture diagram on AWS Lucidchart Lucidchart’s Data Migration to Amazon Aurora Preview of Amazon Aurora Multi-master Sign Up This is My Architecture re:Invent Digital Ocean
Jun 06, 2018
Episode 12: Like Normal Cloud Services, but More Depressing
Does your job challenge and motivate you? Does it utilize your skills? Or, are you ready to go job hunting? Do you want an awesome job that is a resume booster? Companies should be supportive of their employees finding a job that matches their skills and interests. Also, when hiring, companies should offer thoughtful processes for interviews.   Today, I’m talking to Sarah Withee, a polyglot software engineer, mentor, teacher, and robot tinkerer. Sarah went job hunting, and after several job interviews, she finally found a job that made her super happy at Arcadia Healthcare Solutions. Sarah compares the interview processes she experienced at big name tech companies that offer Cloud services. Some of the highlights of the show include: Companies sometimes lose sight that even interview interactions need to be a two-way sale Interviews often involve talking to many people; and if several are bad, that forms a negative impression of the company Companies need to provide interview training and follow the same standards Don’t farm out challenging or unfamiliar issues when interviewing candidates Sarah is very competent, but she is new to Cloud platforms; she is like a sponge, who enjoys learning and having a bare knowledge of new technology How HIPAA regulations impact Sarah’s learning and software engineering work; she has to be more aware of security and safety of healthcare data Being a teacher and mentor affects how Sarah learns new things; everybody learns slightly differently In the Cloud space, know which direction you want to go and start with simpler things to learn the basics; focus on what is relevant to what you are working on Links: Sarah Withee on Twitter #speakerconfessions Sarah Withee on Twitter Sarah Withee Blog Sarah Withee Resume Digital Ocean AWS Azure
May 30, 2018
Episode 11: Hickory Dickory Docker
Docker went from being a small startup to an enterprise company that changed the way people think about their infrastructure to now, where its relevance is somewhat minimal. The conversation is no longer around the container level. Docker has become commonplace. Today, we’re talking to Jérôme Petazzoni, formerly of Docker. While he was with the company for about 8 years, Docker definitely experienced a roller coaster ride.   Some of the highlights of the show include: Amount of work conducted on the enterprise vs. community editions Docker was so widely adopted because its core technology was open source Challenge is to build a viable business and revenue model for the long run Similarities between Docker and Red Hat open source platforms Docker went from six people working in a garage to having a few hundred employees and $1.3 billion valuation Changes happened, but they were gradual; the changes were necessary to be a profitable and sustainable company Contingent of internal and external people believed that Docker was the answer for whatever problem surfaced; Docker would save you, but not always Balancing Act: Pushing forward with a correct message and regulating enthusiasm Networking and Docker for dummies; confusion and problems of things not working as expected have been resolved Things will continue to shift; Kubernetes and the orchestration battle What was unthinkable, could happen by companies pushing the envelope and making progress Will who you have as your Cloud provider stop mattering? It depends. All major Cloud providers plan to offer managed Kubernetes services and what Jérôme thinks of them Jérôme’s opinion on whether Kubernetes will follow this same path as Docker What does the road ahead look like for infrastructure automation? There is potential and lots of best practices in Cloud environments. Links: Jérôme Petazzoni on Twitter Docker Crunch Base Digital Ocean Red Hat Corey's Heresy in the church of docker talk Kubernetes ZooKeeper Azure
May 23, 2018
Episode 10: Education is Not Ready for Teacherless
Like migrating caribou, you tend to follow the trends of what clients are doing, which dictates what you work on as a consultant. Today, we’re talking to Lynn Langit, an independent Cloud architect. She is an AWS Community Hero, Google Cloud developer expert, and former Microsoft MVP. Lynn is a lifelong learner, and she has worked broad and deep across all three large providers. These days, she works mostly with Google Cloud and AWS, rather than Azure, because that’s what her clients are using. Some of the highlights of the show include: Differences between the West Coast and global use of Cloud Education is key; Lynn is the co-founder of Lynn helped create curriculum and resources for school-age children; even her young daughter taught classes on how to code Training for teachers was also needed, so TKP Labs was formed to offer fee-based teacher and developer training Lynn started with classroom training, but has transitioned to online learning Lynn is focusing on Big Data projects and using tools to solve real-world problems Pre-processing and batching data, but not streaming it AWS, Azure, and Google Cloud are all coming out with Big Data-oriented tools Companies need to understand when the market is ready to accept a new paradigm; in the data world, change is slower than in the programming world If you touch a database and get burned, you are not willing to use it again; or you may have never tried to archive your data; hire a consultant to help you Machine learning APIs give customers value quickly; review them before building custom models Migrating data can be a costly project and restricts where the data lives As Cloud proliferates, how will that impact technical education? Lynn’s Cloud for College Students to the rescue! 
Shift from interactive to unidirectional, one-to-many learning styles; the Cloud is ready for serverless, but education is not ready for teacherless Road that many of us walked to get to technical skills no longer exists; how to become a modern technologist Ageism: By age 40, you are considered a manager or useless; don’t be afraid to learn something new Links: Digital Ocean AWS Community Hero Microsoft Azure Digigirlz TKP Labs Lynn Langit on Commonwealth Scientific and Industrial Research Organisation Google BigQuery Amazon Athena AWS Glue Cloud Dataflow Cloud Dataprep Lambda Amazon EC2 Learn Python the Hard Way
May 16, 2018
Episode 9: Cloud Coreyography
Microsoft has experienced a renaissance. Judging by everything that we've seen coming out of Microsoft over the past few years, it feels like the company is really walking the walk. Instead of just talking about how it’s innovative, it’s demonstrating that. Microsoft has been on an amazing journey, making the progression from telling customers what they need to listening to them and responding by building what they ask for. Today, we’re talking to Corey Sanders, Corporate Vice President of Azure Compute at Microsoft. Some of the highlights of the show include: Customers are asking for Microsoft to help them through support and enabling platforms Storytelling efforts through advocates, who play a double role – engaging and defending Microsoft Customers moving to the Cloud are focused on a continuum and progression; they have stuff to move from one location to another and want all the benefits: better agility, faster startup time, etc. Virtual serial console into existing VMs; this is how people are using this and Microsoft is going to, if not encourage this behavior, at least support it Microsoft is the only Cloud with a single-instance SLA Serial consoles: Windows has seen less usage, partly due to operational aspects of Windows vs. Linux. It's not a GUI; it's scripting. Does the operating system matter? 
From a Cloud perspective, it shouldn't have to matter; you should be able to deploy it the way you want Edge enables much more complex and segregated scenarios; that combination with cognitive searches running locally will make it accessible anywhere Branding challenge as customers start to notice that devices are smarter and more complex; will they lose awareness that Microsoft Azure is powering most of these things - they shouldn’t care An awareness of not just what's possible, but what's coming; the democratization of AI Education and fear gap of trying something new and taking that first step; make products and services stupid and simple to use Customers return to add cognitive services and AI capabilities to existing, running deployments, environments, and applications Multi-Cloud solutions can be successful, but there's a caveat; they’re actually built on a service-by-service perspective Azure Stack offers consistency, but some people may place blame on it for poor data center management practices; some expectations and regulations may be frustrating to some customers, but they let Microsoft offer a consistent experience Freedom and flexibility have been challenges for Microsoft and other products for private Clouds What people need to understand about Azure, including from a durability and reliability experience To some extent, scale becomes a necessary prerequisite for some applications Microsoft has taken many steps and is the leader in various areas Links: ReactiveOps Microsoft Azure Corey Sanders on Twitter The Robot Uprising Will Have Very Clean Floors Kubernetes Cassandra Azure Stack
May 09, 2018
Episode 8: A Corporate Prisoner's Dilemma
Have you dabbled with IT infrastructure in AWS? Have you been through the process of AWS partnership? Does being an AWS partner add value? Amazon seeks partners that help drive its business, goals, and value. Today, we’re talking to Justin Brodley, the vice president of Cloud engineering at Ellie Mae. He has been through the AWS partnership process and shares his thoughts about it. He encourages you to find the right partner for your business! Some of the highlights of the show include: Different levels and types of AWS partnerships Shakedown vs. opportunity method for new leads; lead generation expectations Amazon’s improvements eroding business models Partners trying to pivot, but not exclusive to AWS Whether to invest in multi-Cloud Amazon can’t scale its sales team to handle everybody; views partner program as an extension of its salesforce Your company is important and you’re spending a lot of money, but Amazon may not care about you; partner market fills that gap and makes you feel important Corporate prisoner’s dilemma: Your tech company offers something that Amazon doesn’t; but what about when Amazon does offer it? Competitors’ horizontal move to become more diversified Amazon expects partners to offer products and services that it cannot offer yet If partners fail, Amazon decides to do it and do it better Is Amazon’s best interest geared toward its partners or you and your customers? Amazon needs to give incentives and support partners Links: Justin Brodley on Twitter Brodley Group Ellie Mae Digital Ocean AWS Partner Network Lambda API Gateway AWS re:Invent Salesforce Azure Rackspace
May 02, 2018
Episode 7: The Exact Opposite of a Job Creator
Monitoring in the entire technical world is terrible and continues to be a giant, confusing mess. How do you monitor? Are you monitoring things the wrong way? Why not hire a monitoring consultant! Today, we’re talking to monitoring consultant Mike Julian, who is the editor of the Monitoring Weekly newsletter and author of O’Reilly’s Practical Monitoring. He is the voice of monitoring. Some of the highlights of the show include: Observability comes from control theory and monitoring is for what we can anticipate Industry’s lack of interest and focus on monitoring When there’s an outage, why doesn’t monitoring catch it? Unforeseen things. Cost and failure of running tools and systems that are obtuse to monitor Outsource monitoring instead of devoting time, energy, and personnel to it Outsourcing infrastructure means you give up some control; how you monitor and manage systems changes when on the Cloud CloudWatch: Where metrics go to die Distributed and Implemented Tracing: Tracing calls as they move through a system Serverless Functions: Difficulties experienced and techniques to use Warm vs. Cold Start: If a container isn't up and running, it has to set up database connections Monitoring can't fix a bad architecture; it can't fix anything; improve the application architecture Visibility of outages and pain perceived; different services have different availability levels Links: Mike Julian Monitoring Weekly Copy Construct on Twitter Baron Schwartz on Twitter Charity Majors on Twitter Redis Kubernetes Nagios Datadog New Relic Sumo Logic Prometheus Honeycomb Honeycomb Blog CloudWatch Zipkin X-Ray Lambda DynamoDB Pinboard Slack Digital Ocean
Apr 25, 2018
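The warm vs. cold start point in the Episode 7 notes can be illustrated with a minimal sketch. It assumes a Lambda-style `handler(event, context)` entry point and uses Python's built-in `sqlite3` as a stand-in for a real database; `get_connection` and `cold_starts` are hypothetical names for illustration, not anything described in the episode.

```python
import sqlite3

# Module-level state survives between invocations while the container stays warm.
_connection = None
cold_starts = 0  # counts how often the connection had to be set up


def get_connection():
    """Create the database connection on a cold start; reuse it when warm."""
    global _connection, cold_starts
    if _connection is None:
        cold_starts += 1
        # Assumption: an in-memory SQLite database stands in for a real one.
        _connection = sqlite3.connect(":memory:")
    return _connection


def handler(event, context=None):
    """Lambda-style entry point: only the first call pays the setup cost."""
    conn = get_connection()
    row = conn.execute("SELECT ?", (event["value"],)).fetchone()
    return {"value": row[0], "cold_starts": cold_starts}
```

Calling `handler` twice in the same process sets up the connection once; a fresh process (a cold container) would pay that cost again.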
Episode 6: The Robot Uprising Will Have Very Clean Floors
How many of you are considered heroes? Specifically, in the serverless Cloud, Twitter, and Amazon Web Services (AWS) communities? Well, Ben Kehoe is a hero. Ben is a Cloud robotics research scientist who makes serverless Roombas at iRobot. He was named an AWS Community Hero for his contributions that help expand the understanding, expertise, and engagement of people using AWS. Some of the highlights of the show include: Ben’s path to becoming a vacuum salesman History of Roomba and how AWS helps deliver current features Roombas use AWS Internet of Things (IoT) for communication between the Cloud and robot Boston is shaping up to be the birthplace of the robot overlords of the future AWS IoT is serverless and features a number of pieces in one service Robot rising of clean floors AWS Greengrass, which deploys runtimes and manages connections for communication, should not be ignored Creating robots that will make money and work well Roomba’s autonomy to serve the customer and meet expectations Robots with Cloud and network connections Competitive Cloud providers were available, but AWS was the clear winner Serverless approach and advantages for the intelligent vacuum cleaner Future use of higher-level machine learning tools Common concern of lock-in with AWS Changing landscape of data governance and multi-Cloud Preparing for migrations that don’t happen or change the world Data gravity and saving vs. spending money Links: Ben Kehoe on YouTube AWS AWS Community Hero AWS IoT Ben Kehoe on Twitter iRobot AWS Greengrass Shark Cat Medium Boston Dynamics AWS Lambda AWS SageMaker AWS Kinesis Google Cloud Platform Spanner Kubernetes Digital Ocean
Apr 18, 2018
Episode 5: The Last Mainframe with a Kickstart and a Double Clutch
How are companies evolving in a world where Cloud is on the rise? Where Cloud providers are bought out and absorbed into other companies? Today, we’re talking to Nell Shamrell-Harrington about Cloud infrastructure. She is a senior software engineer at Chef, CTO at Operation Code, and core maintainer of the Habitat open source product. Nell has traveled the world to talk about Chef, Ruby, Rails, Rust, DevOps, and Regular Expressions. Some of the highlights of the show include: Chef is a configuration management tool that handles instances, files, virtual machines, containers, and other items. Immutable infrastructure has emerged as the best-practice approach. Chef is moving into next gen through various projects, including one called Compliance, a scanning tool. Some people don’t trust virtualization. Habitat is an open source project featuring software that allows you to use a universal packaging format. Habitat is a run-time, so when you run a package on multiple virtual machines, they form a supervisor ring to communicate via leader/follower roles. Deploying an application depends on several factors, including application and infrastructure needs. It is possible to convert old systems with old deployment models to Habitat. Habitat allows you to lift a legacy application and put it into that modern infrastructure without needing to rewrite the application. You can ease in packages to Habitat, and then have Habitat manage pieces of the application. Habitat is Cloud-agnostic and integrates with public and private Cloud providers by exporting an application as a container. Chef is one of just a few third-party offerings marketed directly by AWS. From inception to deployment, there is a place for large Cloud providers to parlay into language they already speak. Operation Code is a non-profit that teaches software engineering skills to veterans. It helps veterans transition into high-paying engineering jobs. The technology landscape is ever changing.
What skills are most marketable? Operation Code is a learning-by-experience organization and usually starts people on the front-end to immediately see results. Links: Nell Shamrell-Harrington Nell Shamrell-Harrington on Twitter Nell Shamrell-Harrington on GitHub Operation Code Chef Ruby on Rails Rust Regular Expressions Habitat AWS Kubernetes Docker LinkedIn Learning GorillaStack (use discount code: screaming)
Apr 11, 2018
Episode 4: It's a Data Lake, not a Data Public Swimming Pool
Open source activism tends to focus on running on hardware you can trust and avoiding Cloud computing. The problem with some Cloud providers has to do with a conflict of interest between serving customers and how they generate revenue. It’s important for the customer to have control of their computer and their data in the Cloud. But what about their security and privacy? Today, we’re talking to Kyle Rankin, chief security officer at Purism and writer for Linux Journal. He is a Linux expert who decided to work at Purism because of the company’s belief in free software and the Linux community. Some of the highlights of the show include: Cloud providers have faced challenges when it comes to data privacy and who owns what. The word “Cloud” is overloaded, and it is unclear who is in control. Cloud providers can sabotage efforts to make programs work together. Cloud providers may not troll through data and exploit it. Yet, they develop tools for customers to be able to do that. Even though Linux Journal stopped being printed and went digital, and was going under, it’s now back and taking a new approach. What matters to new readers and Linux users is now different than what was important to original readers. The more time you can spend to understand what’s happening behind the scenes will make you much more marketable and adaptable. Kyle explains whether Amazon Linux is becoming a viable concern and if distribution matters anymore. Now, it’s about running an application, not thinking about what it’s running on. Are there gangs of Cloud users? Do people look down on Azure users? The target is always moving and changing. Check out Kyle’s book, Linux Hardening in Hostile Networks: Server Security from TLS to Tor. Links: Kyle Rankin on Twitter Purism Kyle Rankin’s book - Linux Hardening in Hostile Networks: Server Security from TLS to Tor Linux Journal 2.0 FAQ GorillaStack (use “screaming” for discount)
Apr 04, 2018
Episode 3: Turning Off Someone Else's Site as a Service
How do you encourage businesses to pick Google Cloud over Amazon and other providers? How do you advocate for selecting Google Cloud to be successful on that platform? Google Cloud is not just a toy with fun features, but is a capable Cloud service. Today, we’re talking to Seth Vargo, a Senior Staff Developer Advocate at Google. Previously, he worked at HashiCorp in a similar advocacy role and worked very closely with Terraform, Vault, Consul, Nomad, and other tools. He left HashiCorp to join Google Cloud and talk about those tools and his experiences with Chef and Puppet, as well as communities surrounding them. He wants to share with you how to use these tools to integrate with Google Cloud and help drive product direction. Some of the highlights of the show include: Strengths related to Google Cloud include its billing aspect. You can work on Cloud bills and terminate all billable resources. The button you click in the user interface to disable billing across an entire project and delete all billable resources has an API. You can build a chat bot or script, too. It presents anything you’ve done in the console by clicking and pointing, as well as gives you what that looks like in code form. You can expose that from other people’s accounts because turning off someone else’s Website as a service can be beneficial. You can invite anyone with a Google account, not just ‘’ but ‘@’ any domain and give them admin or editor permissions across a project. They’re effectively part of your organization within the scope of that project. For example, this feature is useful for training or if a consultant needs to see all of your different clients in one dashboard, but your clients can’t see each other. Google is a household name. However, it’s important to recognize that advocacy is not just external advocacy, there’s an internal component to it. There are many parts of Google and many features of Google Cloud that people aren’t aware of.
As an advocate, Seth’s job is to help people win. Besides showing people how they can be successful on Google Cloud, Seth focuses on strategic complaining. He is deeply ingrained in several DevOps and configuration management communities, which provide him with positive and negative feedback. It’s his job to take that feedback and convert it into meaningful action items for product teams to prioritize and put on roadmaps. Then, the voice of the communities is echoed in the features and products being internally developed. Amazon has been in the Cloud business for a long time. What took Google so long? For a long time, Google was perceived as being late to the party and not able to offer as comprehensive and experienced services as Amazon. Now, people view Google Cloud as not being substandard, but not where serious business happens. It’s a fully featured platform and it comes down to preferences and pre-existing features, not capability. Small and mid-size companies typically pick a Cloud provider and stick with their choice. Larger companies and enterprises, such as Fortune 50 and Fortune 500 companies, pick multiple Clouds. This is usually due to some type of legal compliance issues, or there are Cloud providers that have specific features. Externally at Google, there is the Deployment Manager tool at It’s the equivalent of CloudFormation, and teams at Google are staffed full time to perform engineering work on it. Every API that you get by clicking a button on are viewing the API Docs accessible via the Deployment Manager. Google Cloud also partners with open source tools and corresponding companies. There are people at Google who are paid by Google who work full time on open source tools, like Terraform, Chef, and Puppet. This allows you to provision Google Cloud resources using the tools that you prefer.
According to Seth, there are five key pillars of DevOps: 1) Reduce organizational silos and break down barriers between teams; 2) Accept failures; 3) Implement gradual change; 4) Tooling and automation; and 5) Measure everything. Think of DevOps as an interface in a programming language, like Java, or a type of language where it doesn’t actually define what you do, but gives you a high level of what the function is supposed to implement. With the SRE discipline, there’s a prescribed way for performing those five pillars of DevOps. Specific tools and technologies used within Google, some of which are exposed publicly as part of Google Cloud, enable the kind of DevOps culture and DevOps mindset that occur. A reason why Google offers abstract classes in programming is that there’s more than one way to solve a problem, and SRE is just one of those ways. It’s the way that has worked best for Google, and it has worked best for a number of customers that Google is working with. But there are some other ways, too. Google supports those ways and recognizes that there isn’t just one path to operational success, but many ways to reach that prosperity. The book, Site Reliability Engineering, describes how Google does SRE, which Google has tried to evangelize to the world because it can help people improve operations. The flip side of that is that organizations need to be cognizant of their own requirements. Google has always been held up, along with several other companies, as a shining beacon of how infrastructure management could be. But some say there are still problems with its infrastructure, even after 20-some years and billions invested. Every company has problems, some of them technical, some cultural. Google is no exception. The one key difference is the way Google handles issues from a cultural perspective. It focuses on fixing the problem and making sure it doesn’t happen again. There’s a very blameless culture. Conferences tend to include a lot of hand waving and storytelling.
But as an industry, more war stories need to be told instead of pleasure stories. Conference organizers want to see sunshine and rainbows because that sells tickets and makes people happy. The systemic problem is how to talk about problems out in the open. Becoming frustrated and trying to figure out why computers do certain things is a key component of the SRE discipline referred to as Toil -  work tied to systems that either we don’t understand or don’t make sense to automate. Those going to Google Cloud to ‘move and improve’ tend to be a mix of those from other Cloud providers and those from on-premise data center deployments. Move and improve is where there are VMs in a data center, and they need to be moved to the Cloud. There are tiny differences around the Cloud-native paradigm and providers. There’s some key pillars: Does it handle restarts well? Is it highly available? Can it be containerized, even though containers aren’t necessarily required for Cloud native? Does it package all of its dependencies with it? Can it run on different operating systems? All of these things are generic, they’re not specific to a Cloud provider. Links: Google Cloud and blog Amazon Web Services HashiCorp Terraform Vault Consul Nomad Chef Puppet Kubernetes AutoML Monitorama Azure CloudFormation Ansible Elk Stack Site Reliability Engineering book for O’Reilly Fastly Hacker News Cloud Foundry Microsoft Cloud Alibaba Cloud Lambda Quotes by Seth: “Everything we do on Google Cloud is API First. Anytime you click a button in that Web UI, there is a corresponding API call, which means you can build automation, compliance, and testing around these various aspects.” “The IAM and permission management in Google Cloud is incredibly powerful. It leverages the same IAM permissions that G Suite has which is hosted Gmail, Calendar, and all of those other things.” “How do I get people who want to use Google Cloud or don’t know about Google Cloud? 
The ability to be successful on the platform.” “I would definitely say that any company you work at, whether the recruiter tells you that it’s all sunshine and rainbows and there’s nothing ever wrong is a lie.”
Mar 28, 2018
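The "DevOps as an interface, SRE as one implementation" analogy from the Episode 3 notes can be sketched in code. This is only an illustration: the method names paraphrase the five pillars listed in the notes, and the strings returned by the hypothetical `SRE` class are assumed example practices, not a statement of how Google defines them.

```python
from abc import ABC, abstractmethod


class DevOps(ABC):
    """The five pillars as an interface: what to achieve, not how."""

    @abstractmethod
    def reduce_organizational_silos(self): ...

    @abstractmethod
    def accept_failure(self): ...

    @abstractmethod
    def implement_gradual_change(self): ...

    @abstractmethod
    def use_tooling_and_automation(self): ...

    @abstractmethod
    def measure_everything(self): ...


class SRE(DevOps):
    """One concrete implementation; the practices below are illustrative assumptions."""

    def reduce_organizational_silos(self):
        return "share ownership of production between teams"

    def accept_failure(self):
        return "run blameless reviews after incidents"

    def implement_gradual_change(self):
        return "ship small changes and roll them out in stages"

    def use_tooling_and_automation(self):
        return "automate toil away"

    def measure_everything(self):
        return "track reliability and toil with metrics"


def describe(practice):
    """Works with any implementation of the interface, not just SRE."""
    return [
        practice.reduce_organizational_silos(),
        practice.accept_failure(),
        practice.implement_gradual_change(),
        practice.use_tooling_and_automation(),
        practice.measure_everything(),
    ]
```

Another organization could write its own subclass with different practices, and `describe` would accept it unchanged, which mirrors the point that SRE is one path among several.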
Episode 2: Shoving a SAN into us-east-1
When companies migrate to the Cloud, they are literally changing how they do everything in their IT department. If lots of customers exclusively rely on a service, like us-east-1, then they are directly impacted by outages. There is safety in a herd and in numbers because everybody sits there, down and out. But that doesn’t mean you shouldn’t engineer your application to be a little less of a single point of failure. It’s a bad idea to use a sole backing service for something, and it’s unacceptable from a business perspective. Today, we’re talking to Chris Short from the Cloud and DevOps space. Recently, he was recognized for his DevOps’ish newsletter and won the People’s Choice Award for his DevOps writing. He’s been blogging for years and writing about things that he does every day, such as tutorials, code, and methods. Now, Chris, along with Jason Hibbets, run the DevOps team for Some of the highlights of the show include: Chris’ writing makes difficult topics understandable. He is frank and provides broad information. However, he admits when he is not sure about something. SJ Technologies aims to help companies embrace a DevOps philosophy, while adapting their operations to a Cloud-native world. Companies want to take advantage of philosophies and tooling around being Cloud native. Many companies consider a Cloud migration because they’ve got data centers across the globe. It’s active-passive backup with two data centers that are treated differently and cannot be switched between easily. Some companies do a Cloud migration to refactor and save money. A Cloud migration can result in you having to shove your SAN into us-east-1. It can become a hybrid workflow. Lift and shift is often considered the first legitimate step toward moving to the Cloud. However, know as much as you can about your applications and RAM and CPU allowances. Look at density when you’re lifting and shifting. Know how your applications work and work together.
Simplify a migration by knowing what size and instances to use and what monitoring to have in place. Some do not support being on the Cloud due to a lack of understanding of business practices and how they are applied. But, most are no longer skeptical about moving to the Cloud. Now, instead of ‘why cloud,’ it becomes ‘why not.’ Don’t jump without looking. Planning phases are important, but there will be unknowns that you will have to face. Downtime does cost money. Customers will go to other sites. They can find what they want and need somewhere else. There’s no longer a sole source of anything. The DevOps journey is never finished, and you’re never done migrating. Embrace changes yourself to help organizations change. Links: Chris Short on Twitter DevOps'ish SJ Technologies Amazon Web Services Cloud Native Infrastructure Oracle OpenShift Puppet Kubernetes Simon Wardley Rackspace The Mythical Man-Month Atlassian BuzzFeed Quotes by Chris: “Let’s not say that they’re going whole hog Cloud Native or whole hog cloud for that matter but they wanna utilize some things.” “They can never switch from one to the other very easily, but they want to be able to do that in the Cloud and you end up biting off a lot more than you can chew…” “Create them in AWS. Go. They gladly slurp in all your VMware instances; you can create a mapping of this sized thing to that sized thing and off you go. But it’s a good strategy to just get there.” “We have to get better as technologists in making changes and helping people embrace change.”
Mar 21, 2018
Episode 1: Feature Flags with Heidi Waterhouse of LaunchDarkly
This podcast features people doing interesting work in the world of Cloud. What is the state of the technical world? Let’s first focus on the up or down, on or off function of feature flags. Today, we’re talking to Heidi Waterhouse, a technical writer turned Developer Advocate at LaunchDarkly, which is a feature flag service - a way to wrap a snippet of code around your feature and make it into an instrument to turn on or off. It lets you turn things on and off in your codebase quickly without having to do several commits. However, it is difficult to track it when there are more than about a dozen flags. So, LaunchDarkly provides a way to manage your features at scale with a usable interface and API. Some of the highlights of the show include: A feature flag allows you to hide items before you want them to go live on your Website. You hide it behind a feature flag, doing all the work ahead of time. Then, at some point, you turn it all on instantly without the risk of pushing untested code into your production. You can test at scale to gain authentic data. Test something with your team, your company’s employees, your customers, etc. However, no matter how good your integration tests are, there are always wobbles to watch for in the system. With implementation, there are a few paths that can work, such as the massive reorganization path. Or, you can just start incrementally with feature flags for new features. LaunchDarkly thinks of the Cloud as the surface because it mostly works with people who are doing Web-based delivery of features. Major companies, like Google and Facebook, offer services similar to feature flags for their own development. They’re operating on such a giant scale that they have internal teams doing it. Companies use feature flags on the front end and for other purposes. It works through the whole stack from frontend page delivery, pricing tiers, white labeling, style sheets, to safer deployments. Do not focus on documentation.
You should not have to read documentation for anything that you don’t own. Every feature should have documentation tied to its code. Create a customized experience. Feature flags effectively manage and minimize risk. There is always risk in the world, but what causes disaster is not just one failure. It is a multiplication of failures. This goes wrong and that goes wrong. Feature flagging breaks monolithic releases into tiny chunks that can go forward or backward. LaunchDarkly holds monthly meet-ups called Test in Production. People share their use case regarding continuous integration, continuous deployment, DevOps, etc. Links: LaunchDarkly iPad Autodesk Slack IBM Quotes by Heidi: “What feature flags do is make it possible for you to push out a deployment with things hidden, we call it launching darkly.” “We’re all about avoiding risk, I think this is our motto this year, eliminate risk…you can’t eliminate risk, but you can make it much less risky.” “Go ahead and write your feature. You know that it’s hidden behind the magical feature flagging curtain until you’re ready to turn it on.” “If 20 years of technical writing taught me anything, it’s that nobody wants to be reading documentation.”
Mar 19, 2018
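The feature flag idea described in the Episode 1 notes (wrap a feature in a check, ship it hidden, then turn it on for your team, a slice of customers, or everyone) can be sketched as follows. This is a minimal in-memory illustration under stated assumptions: `FlagStore`, `set_flag`, `is_enabled`, and `render_checkout` are hypothetical names and do not represent LaunchDarkly's API.

```python
import hashlib


class FlagStore:
    """In-memory feature flag store (hypothetical; not LaunchDarkly's API)."""

    def __init__(self):
        self._flags = {}

    def set_flag(self, name, enabled=False, rollout_percent=0, allow_users=()):
        """Create or update a flag without touching the feature's code."""
        self._flags[name] = {
            "enabled": enabled,
            "rollout_percent": rollout_percent,
            "allow_users": set(allow_users),
        }

    def is_enabled(self, name, user_id):
        flag = self._flags.get(name)
        if flag is None or not flag["enabled"]:
            return False  # unknown or switched-off flags keep the feature hidden
        if user_id in flag["allow_users"]:
            return True  # e.g. your own team sees the feature first
        # Hash the flag name and user into a stable bucket from 0 to 99,
        # so the same user always gets the same answer for a given flag.
        digest = hashlib.sha256(f"{name}:{user_id}".encode()).hexdigest()
        return int(digest, 16) % 100 < flag["rollout_percent"]


def render_checkout(flags, user_id):
    """The new code path ships to production but stays dark until flagged on."""
    if flags.is_enabled("new-checkout", user_id):
        return "new checkout"
    return "old checkout"
```

A short usage example: with `flags = FlagStore()` and `flags.set_flag("new-checkout", enabled=True, allow_users=["alice"])`, `render_checkout(flags, "alice")` takes the new path while other users stay on the old one; raising `rollout_percent` widens the audience, and setting `enabled=False` rolls everyone back without a new deployment.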