Coding Blocks

242 Episodes


Designing Data-Intensive Applications – Lost Updates and Write Skew

2023-03-19 01:24:48

What are lost updates, and what can we do about them? Maybe we don't do anything and accept the write skew? Also, Allen has sharp ears, Outlaw's gort blah spotterfiles, and Joe is just thinking about breakfast. The full show notes for this episode are available at https://www.codingblocks.net/episode206. News Thank you for the amazing reviews! iTunes: JomilyAnv Want to help us out? Leave us a review. Great book! Preventing Lost Updates Last episode we talked about weak isolation, read committed, and snapshot isolation. There is one major problem we didn't discuss, called "The Lost Update Problem". Consider a read-modify-write transaction, and now imagine two of them happening at the same time. Even with snapshot isolation, both transactions can read the same value before either one writes; whichever writes second then overwrites the first write without ever having seen it. Common cases: Incrementing/decrementing values (counters, bank accounts). Updating complex values (JSON, for example). CMS updates that send the full page as an update. Solutions: Atomic Writes - Some databases support atomic updates that effectively combine the read and write. Cursor Stability - locking the read object until the update is performed. Single Threading - Force all atomic operations to happen serially through a single thread. Explicit Locking The application can be responsible for explicitly locking objects, placing responsibility in the developers' hands. This makes sense in certain situations - imagine a multiplayer game where multiple players can move a shared object. It's not enough to lock the data and then apply both updates in order, since the shared game world can react (i.e., showing that the item is in use). Detecting Lost Updates Locks can be tricky, so what if we reused the snapshot mechanism we discussed before? We're already keeping a record of the last transactionId to modify our data, and we know our current transactionId. What if we just failed any updates where our current transactionId was less than the transactionId of the last write to our data? 
This allows for naive application code, but also gives you fewer options: retry or give up. Note: MySQL's InnoDB's Repeatable Read feature does not support this, so some argue it doesn't qualify as snapshot isolation. What if you didn't have transactions? If you didn't have transactions, let alone a snapshot number, you could get similar behavior by doing a compare-and-set. Example: update account set balance = 10 where balance = 9 and id = 'ABC' This works best in simple databases that support atomic updates, but not great with snapshot isolation. Note: it's up to the application code to check that updates were successful - updating 0 records is not an error. Conflict resolution and replication We haven't talked much about replicas lately, so how do we handle lost updates when we have multiple copies of data on multiple nodes? Compare-and-set strategies and locking strategies assume a single up-to-date copy of the data….uh oh. The options are limited here, so the strategy is to accept the writes and have an application process decide what to do. Merge: Some operations, like incrementing a counter, can be safely merged. Riak has special datatypes for these. Last Write Wins: This is the most common solution. It's simple, but it is prone to lost updates. 
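The compare-and-set idea above can be tried out with a minimal sketch. It uses Python's built-in sqlite3 module purely for convenience; the account table mirrors the example statement in the notes, while the helper name compare_and_set_balance and the in-memory database are assumptions made for illustration, not anything from the episode or the book.

```python
import sqlite3


def compare_and_set_balance(conn, account_id, expected, new):
    """Set balance to `new` only if it still equals `expected`.

    Returns True when exactly one row changed. An UPDATE that matches
    0 rows is not an error, so the caller has to check the row count.
    """
    cur = conn.execute(
        "UPDATE account SET balance = ? WHERE id = ? AND balance = ?",
        (new, account_id, expected),
    )
    conn.commit()
    return cur.rowcount == 1


# Hypothetical schema and data, matching the example statement in the notes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('ABC', 9)")
conn.commit()

# Two writers both read balance = 9 earlier; only the first update applies.
first = compare_and_set_balance(conn, "ABC", expected=9, new=10)
second = compare_and_set_balance(conn, "ABC", expected=9, new=11)
balance = conn.execute(
    "SELECT balance FROM account WHERE id = 'ABC'"
).fetchone()[0]

print(first, second, balance)
```

The second writer is expected to re-read the balance and retry (or give up) when the helper returns False, which is the "check that updates were successful" responsibility mentioned above.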
Write Skew and Phantoms Write skew - when a race condition allows writes to different records to take place at the same time in a way that violates a state constraint. The example given in the book is the on-call doctor rotation. If one record had been modified after the other record's transaction had completed, the race condition would not have taken place. Write skew is a generalization of the lost update problem. Preventing write skew Atomic single-object locks won't work because there's more than one object being updated. Snapshot isolation also doesn't work in many implementations - SQL Server, PostgreSQL, Oracle, and MySQL won't prevent write skew. Requires true serializable isolation. Most databases don't allow you to create constraints on multiple objects, but you may be able to work around this using triggers or materialized views as your constraint. They mention that if you can't use serializable isolation, your next best option may be to lock the rows for an update in a transaction, meaning nothing else can modify them while the transaction is open. Phantoms causing write skew Pattern: A query checks some business requirement - i.e., there's more than one doctor on call. The application decides what to do with the results from the query. If the application decides to go forward with the change, then an INSERT, UPDATE, or DELETE operation will occur that would change the outcome of the previous step's application decision. They mention the steps could occur in different orders; for instance, you could do the write operation first and then check to make sure it didn't violate the business constraint. In the case of checking for records that meet some condition, you could do a SELECT FOR UPDATE and lock those rows. In the case that you're checking that no records exist that match a condition, there's nothing to lock, so the SELECT FOR UPDATE won't work and you get a phantom - a write in one transaction changes the search result of a query in 
another transaction. Snapshot isolation avoids phantoms in read-only queries, but can't stop them in read-write transactions. Materializing conflicts The problem we mentioned with phantoms is that there's no record/object to lock because it doesn't exist. What if you were to have a set of records that could be used for locking to alleviate the phantom writes? Create records for every possible combination of conflicting events and only use those to lock when doing a write. It's called "materializing conflicts" because you're taking the phantom writes and turning them into lock records that will prevent those conflicts. This can be difficult and error-prone, since you have to create all the combinations of locks, AND this is a nasty leakage of your storage into your application. Should be a last resort. Resources We Like The 12 Factor App and Google Cloud (cloud.google.com) Tip of the Week Docker's BuildKit is their backend builder that replaces the "legacy" builder by adding new non-backward compatible functionality. The way you enable BuildKit is a little awkward, either passing flags or setting variables as well as enabling the features per Dockerfile, but it's worth it! One of the cool features is the "mount" flag that you can pass as part of a RUN statement to bring in files that are not persisted past that layer. This is great for efficiency and security. The "cache" type is great for utilizing Docker's cache to save time in future builds. The "bind" type is nice for mounting files you only need temporarily, like source code for a compiled language. The "secret" type is great for temporarily bringing in environment variables without persisting them. Type "ssh" is similar to "secret", but for sharing ssh keys. Finally, "tmpfs" is similar to swap memory, using an in-memory file system that's nice for temporarily storing data in primary memory as a file that doesn't need to be persisted. (github.com) Did you know Google has a Google Cloud Architecture diagramming tool? 
It's free and easy to use so give it a shot! (cloud.google.com) ChatGPT has an app for Slack. It's designed to deliver instant conversation summaries, research tools, and writing assistance. Is this the end of scrolling through hundreds of messages to catch up on whatever is happening? /chatgpt summarize (salesforce.com) Have you heard about ephemeral containers? It's a convenient way to spin up temporary containers that let you inspect files in a pod and do other debugging activities. Great for, well, debugging! (kubernetes.io)
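Going back to the write skew notes in this episode, the on-call doctor race can be illustrated with a small simulation. This is only a sketch in plain Python, not a real database: the doctor names, the request_leave helper, and the use of dictionary copies as "snapshots" are all assumptions made for illustration.

```python
def request_leave(snapshot, doctor):
    """Decide, from a snapshot, whether `doctor` may go off call.

    Returns the write to apply as (doctor, False), or None if the
    constraint "at least one doctor stays on call" would be violated.
    """
    if sum(snapshot.values()) >= 2:
        return (doctor, False)
    return None


# Concurrent case: both transactions take a snapshot before either commits.
on_call = {"alice": True, "bob": True}
snapshot_a = dict(on_call)
snapshot_b = dict(on_call)
writes = [request_leave(snapshot_a, "alice"), request_leave(snapshot_b, "bob")]
for write in writes:
    if write is not None:
        on_call[write[0]] = write[1]
skewed_count = sum(on_call.values())

# Serial case: the second transaction sees the first one's committed write.
on_call_serial = {"alice": True, "bob": True}
for doctor in ("alice", "bob"):
    write = request_leave(dict(on_call_serial), doctor)
    if write is not None:
        on_call_serial[write[0]] = write[1]
serial_count = sum(on_call_serial.values())

print(skewed_count, serial_count)
```

Each transaction writes a different record, so no single-object lock or lost-update check notices the problem; only running the two decisions one after the other (or an equivalent serializable guarantee) keeps a doctor on call.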

ChatGPT and the Future of Everything

2023-03-06 02:00:41

There's this thing called ChatGPT you may have heard of. Is it the end for all software developers? Have we reached the epitome of mankind? Also, should you write your own or find a FOSS solution? That and much more as Allen gets redemption, Joe has a beautiful monologue, and Outlaw debates a monitor that is a thumb size larger than his current setup. If you're in a podcast player and would prefer to read it on the web, follow this link: https://www.codingblocks.net/episode205 News Thank you for the amazing reviews! iTunes: MalTheWarlock, Abdullah Nafees, BarnabusNutslap Orlando Code Camp coming up Saturday March 25th https://orlandocodecamp.com/ ChatGPT Is this the beginning or the end of software development as we know it? Are you using it for work? Does your work have an AI policy? OpenAI has recently announced a whopping 90% price reduction on their ChatGPT and Whisper API calls: $0.002 per 1,000 ChatGPT tokens and $0.006 per minute for Whisper. You also get $5 in free credit in your first 3 months, so give it a shot! https://openai.com/pricing Roll Your Own vs FOSS This probably isn't the first time and it won't be the last we ask the question - should you write your own version of something if there's a good Free Open Source Software alternative out there? Typed vs Untyped Languages Another topic that we've touched on over the years - which is better and why? Any considerations when working with teams of developers? What are the pros and cons of each? Cloud Pricing If you're spending a good amount of money in the cloud, you should probably talk to a sales rep for your given cloud and try to negotiate rates. You may be surprised how much you can save. And...you never know until you ask! Outlaw has the Itch to get a new Monitor Is it worth upgrading from a 34" ultrawide to a 38" ultrawide? What's a good size for a 4k monitor? Should you even get a 4k monitor? Should you go curved? 
Some references mentioned during the show NVidia monitor search page: https://www.nvidia.com/en-us/geforce/products/g-sync-monitors/specs/ LG 38" ultrawide: https://amzn.to/3SLeqUO Rtings recommended gaming monitors: https://www.rtings.com/monitor/reviews/best/by-usage/gaming Games Radar best G-Sync monitors: https://www.gamesradar.com/best-g-sync-monitors/ Acer Predator 38" ultrawide: https://amzn.to/3ZBDb80 Samsung Odyssey Neo G9 49" Ultrawide: https://amzn.to/3ZGMTpx LG 49WQ95C-W 49" Ultrawide: https://amzn.to/3mk0TY5 Resources from this episode How to jailbreak ChatGPT - List of Prompts: https://www.mlyearning.org/how-to-jailbreak-chatgpt/ Magazine stops accepting submissions due to bots: https://nypost.com/2023/02/22/sci-fi-magazine-not-accepting-submissions-due-to-bots/ Stack Overflow bans ChatGPT answers: https://www.theverge.com/2022/12/5/23493932/chatgpt-ai-generated-answers-temporarily-banned-stack-overflow-llms-dangers ChatGPT detection tool already out: https://www.ctvnews.ca/sci-tech/cheaters-beware-chatgpt-maker-releases-ai-detection-tool-1.6253847 Tips of the Week Did you know that the handy, dandy application jq is great for formatting json AND it's also Turing complete? You can do full on programming inside jq to make changes - conditionals, variables, math, filtering, mapping...it's Turing Complete! https://stedolan.github.io/jq/ Want to freshen up your space, but you just don't have the vision? Give interiorai.com a chance, upload a picture of your room and give it a description. It works better than it should. 
You can sort your command line output by piping it to sort, for example ls -l | sort -k2 -b On macOS you can drag a non-fullscreen window to a fullscreen desktop When using the ls -l command in a terminal, that first numeric column shows the number of hard links to a file - meaning the number of names an inode has for that file Argument parser for Python 3 - makes parsing command line arguments a breeze and creates beautiful --help documentation to boot! https://docs.python.org/3/library/argparse.html .NET has an equivalent parser we've mentioned in the past https://www.nuget.org/packages/NuGet.CommandLine
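As a quick illustration of the argparse tip above, here is a minimal sketch. The "greet" tool, its arguments, and the main helper are all hypothetical and exist only to show the parser in action; argparse generates the --help text from the help strings automatically.

```python
import argparse


def build_parser():
    # Hypothetical command-line tool used only for this example.
    parser = argparse.ArgumentParser(
        prog="greet", description="Print a greeting one or more times."
    )
    parser.add_argument("name", help="who to greet")
    parser.add_argument(
        "-c", "--count", type=int, default=1, help="number of greetings"
    )
    parser.add_argument(
        "--shout", action="store_true", help="upper-case the output"
    )
    return parser


def main(argv):
    """Parse `argv` (a list of strings) and return the lines to print."""
    args = build_parser().parse_args(argv)
    line = f"Hello, {args.name}!"
    if args.shout:
        line = line.upper()
    return [line] * args.count


# Passing an explicit list stands in for what would normally be sys.argv[1:].
lines = main(["Ada", "--count", "2", "--shout"])
print("\n".join(lines))
```

Running the same parser with --help prints the usage line, the description, and one entry per argument without any extra code.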

2023 Holiday Season Developer Shopping List

2023-11-25 02:28:51

To see all the items on 2023's holiday shopping list, head over to https://www.codingblocks.net/episode223

Open Telemetry - Instrumentation and Metrics

2023-10-29 01:13:07

https://www.codingblocks.net/episode221

When to Log Out

2024-10-07 01:03:14

Well, this is awkward. Coding Blocks is signing out for now, in this episode we'll talk about what's happening and why. We have had an amazing run, far better than we ever expected. Also, Joe recommends 50 games, Allen goes for the gold, and Outlaw is totally normal. (And we're not crying you're crying!) Thank you for the support over the last 11 (!!!) years. It's been a wild ride, and the last thing we ever expected when starting a tech podcast was getting to meet so many fantastic people. View the full show notes here: https://www.codingblocks.net/episode242 Tip of the Week UFO 50 is an odd collection of 50 pseudo-retro video games made by a small group of game developers, most notably including Derek Yu of Spelunky. It's a unique and specific experience that reminds me of spending the night at your friend's house who had some console gaming system that you'd only ever heard rumors about. The games seem small and simple at first blush, but there is surprising depth. Favorites so far are Kick Club, Avianos, Attactics, and Mortol. (Steam) Use JSDoc annotations to make VSCode "understand" your data (jsdoc.app) Can you change your password without needing current password? (askubuntu.com) Did you know you can use VS Code for interactive rebasing? How to enable VS Code Interactive Editor (StackOverflow) GitLens (marketplace.visualstudio.com)

Things to Know when Considering Multi-Tenant or Multi-Threaded Applications

2024-09-02 01:58:45

For the full show notes head over to: https://www.codingblocks.net/episode241

Two Water Coolers Walk Into a Bar

2024-08-18 01:33:43

Grab your headphones because it's water cooler time! In this episode we're catching up on feedback, putting our skills to the test, and wondering what we're missing. Plus, Allen's telling it how it is, Outlaw is putting it all together and Joe is minding the gaps! View the full show notes here: https://www.codingblocks.net/episode240 Reviews Thank you again for taking the time to share your review with us! iTunes: Yesso95 Spotify: Auxk0rd, artonus News Atlanta Dev Con September 7th, 2024 https://www.atldevcon.com/ DevFest Central Florida September 28th, 2024 https://devfestflorida.com/ Two water coolers walk into a bar... Several folks share their origin stories in the Coding Blocks slack - especially in episode-discussion Example of dealing with legacy code / hiring people that will work on it (Episode 239) Intentional architecture…what's the worst that could happen? What's the sentiment like on Hacker News? (outerbounds.com) Cat8 is not small! Why isn't anything easy? Kubernetes trivia, where are your blind spots? (proprofs.com) Ask Claude: Can you give me an example of the kinds of competitions that might exist in a humorous version of the Olympics for programmers? Data gathering and parsing - it doesn't seem to have gotten much better in decades…are we wrong? Tip of the Week 8 Top Docker Tips and Tricks for 2024 (docker.com) Have you tried Earthly? It's like Dockerfiles for all of your builds that you can run locally. (earthly.dev) Java's JavaAgent Explained (bito.ai) Mirrord is an alternative to Telepresence that makes working with Kubernetes easier (mirrord.dev) Kubernetes + Skaffold + Telepresence + K9s = Winning, it's a great combination of tools that work great together! https://cloud.google.com/kubernetes-engine?hl=en https://skaffold.dev/ https://www.telepresence.io/ https://k9scli.io/

How did We Even Arrive Here?

2024-08-04 01:37:14

For the full show notes please visit: https://www.codingblocks.net/episode239

AI, Blank Pages, and Client Libraries...oh my!

2024-07-07 01:47:21

It's Water Cooler Time! We've got a variety of topics today, and also Outlaw's lawyering up, Allen can read QR codes now, and Joe is looking at second careers. View the full show notes here: https://www.codingblocks.net/episode238 News As always, thank you for leaving us a review – we really appreciate them! Almazkun, vassilbakalov, DzikijSver Atlanta Dev Con September 7th, 2024 https://www.atldevcon.com/ DevFest Central Florida on September 28th, 2024 Interested? Submit your talk proposal here: https://sessionize.com/devfest-florida-orlando-2024/ Water Cooler How many programmers are there now? (statista.com) Are we still growing? What will it be like when we stop growing? What will people be doing instead? AI music generators are being sued! (msn.com) Curse of the Blank Page Naming things is important, gives them power…but also the power to defeat them! Don't make any one specific technology your hammer Client libraries that completely change with server upgrades What's the most important or relevant thing to learn as a developer now? Do you research or learn on vacation? Tip of the Week Curated, High-Quality Stories, Essays, Editorials, and Podcasts based around Software Engineering. It's more polished and less experimental than PagedOut (Github) Also, there's a new Paged Out, complete with downloadable art. It's more avant-garde than GitHub's Readme project, featuring articles on Art, Cryptography, Demoscenes, and Reverse Engineering. (pagedout.institute) Travel Router - Extensible Authentication Protocol (EAP) is used to pass the authentication information between the supplicant (the Wi-Fi workstation) and the authentication server (Microsoft IAS or other) (Amazon) Comparison of Travel Routers (gi.inet.com) Carrying case for router (Amazon) Travel power cube - 6 power outlets followed by 3 (Amazon) Did you know that Windows has a built-in camera QR code reader? 
Guava caching libraries in Java (Github) Caffeine is a more recent alternative (Github) Generative AI for beginners - "Learn the fundamentals of building Generative AI applications with our 18-lesson comprehensive course by Microsoft Cloud Advocates." Microsoft has a course for getting into generative AI! (microsoft.github.io) Claude is better than Chat GPT? (claude.ai) How to Get the Most out of Postgres Memory Settings - thanks Mikerg! (temb.io)

Alternatives to Administering and Running Apache Kafka

2024-06-23 01:05:15

View the show notes on the web: https://www.codingblocks.net/episode237 In the past couple of episodes, we'd gone over what Apache Kafka is and along the way we mentioned some of the pains of managing and running Kafka clusters on your own. In this episode, we discuss some of the ways you can offload those responsibilities and focus on writing streaming applications. Along the way, Joe does a mighty fine fill-in for proper noun pronunciation and Allen does a southern auctioneer-style speed talk. Reviews As always, thank you for leaving us a review - we really do appreciate them! From iTunes: Abucr7 Upcoming Events Atlanta Dev Con September 7th, 2024 https://www.atldevcon.com/ DevFest Central Florida on September 28th, 2024 Interested? Submit your talk proposal here: https://sessionize.com/devfest-florida-orlando-2024/ Kafka Compatible and Kafka Functional Alternatives Why? Because running any type of infrastructure requires time, knowledge, and blood, sweat and tears Confluent https://www.confluent.io/confluent-cloud/pricing/ We've personally had good experiences with their Kafka as a service WarpStream https://www.warpstream.com/ "WarpStream is an Apache Kafka® compatible data streaming platform built directly on top of object storage: no inter-AZ bandwidth costs, no disks to manage, and infinitely scalable, all within your VPC" ZERO disks to manage 10x cheaper than running Kafka Agents stream data directly to and from object storage with no buffering on local disks and no data tiering. 
Create new serverless "Virtual Clusters" in our control plane instantly Support different environments, teams, or projects without managing any dedicated infrastructure Things you won't have to do with WarpStream Upscale a cluster that is about to run out of space Figure out how to restore quorum in a Zookeeper cluster or Raft consensus group Rebalance partitions in a cluster "WarpStream is protocol compatible with Apache Kafka®, so you can keep using all your favorite tools and software. No need to rewrite your application or use a proprietary SDK. Just change the URL in your favorite Kafka client library and start streaming!" Never again have to choose between reliability and your budget. WarpStream costs the same regardless of whether you run your workloads in a single availability zone, or distributed across multiple zones. WarpStream's unique cloud native architecture was designed from the ground up around the cheapest and most durable storage available in the cloud: commodity object storage WarpStream agents use object storage as the storage layer and the network layer, side-stepping interzone bandwidth costs entirely Can be run in BYOC (bring your own cloud) or in Serverless BYOC - you provide all the compute and storage - the only thing that WarpStream provides is the control plane Data never leaves your environment Serverless - fully managed by WarpStream in AWS - will automatically scale for you even down to nothing! Can run in AWS, GCP and Azure Agents are also S3 compatible so can run with S3 compatible storage such as Minio and others Redpanda Redpanda is a slimmed down native Kafka protocol compliant drop-in replacement for Kafka There's even a Redpanda Connect! 
Its main differentiator is performance, it's cheaper and faster Apache Pulsar Similar to Kafka, but changes the abstraction on storage to allow more flexibility on IO Has a Kafka compliant wrapper for interchangeability Simple data offload functionality to S3 or GCS Multi tenancy Geo replication Cloud alternatives Google Cloud - PubSub https://cloud.google.com/pubsub Azure - Event Hubs https://azure.microsoft.com/en-us/products/event-hubs AWS - Kinesis https://aws.amazon.com/kinesis/ Tip of the Week Chord AI is an Android/iOS app that uses AI to figure out the chords for a song. This is really useful if you just want to get the quick gist of a song to play along with. The base version is free, and has a few different integration options (YouTube, Spotify, Apple Music, Local Files for me) and it uses your phone's microphone and a little AI magic to figure it out. It even shows you how to play the chords on guitar or piano. The free version gets you basic chords, but you can pay $8.99 a month to get more advanced/frequent chords. https://www.chordai.net/ Pandas is nearly as good, if not better than SQL for exploring data https://pandas.pydata.org/ Another tip for displaying in Jupyter notebooks - use to_html() on your dataframes to show the full column data https://www.geeksforgeeks.org/how-to-render-pandas-dataframe-as-html-table/ Take photos or video and convert them into 3d models https://lumalabs.ai/luma-api

Nuts and Bolts of Apache Kafka

2024-06-09 01:37:26

Topics, Partitions, and APIs oh my! This episode we're getting further into how Apache Kafka works and its use cases. Also, Allen is staying dry, Joe goes for broke, and Michael (eventually) gets on the right page. The full show notes are available on the website at https://www.codingblocks.net/episode236 News Thanks for the reviews! angingjellies and Nick Brooker Please leave us a review! (/review) Atlanta Dev Con is coming up, on September 7th, 2024 (www.atldevcon.com) Kafka Topics They are partitioned - this means they are distributed (or can be) across multiple Kafka brokers into "buckets" New events written to Kafka are appended to partitions The distribution of data across brokers is what allows Kafka to scale so well as data can be written to and read from many brokers simultaneously Events with the same key are written to the same partition as the original event Kafka guarantees that events within a partition are always read in the order that they were written For fault tolerance and high availability, topics can be replicated…even across regions and data centers NOTE: If you're using a cloud provider, know that this can be very costly as you pay for inbound and outbound traffic across regions and availability zones A typical replication configuration for production setups is 3 replicas Kafka APIs Admin API - used for managing and inspecting topics, brokers, and other Kafka objects Producer API - used to write events to Kafka topics Consumer API - used to read data from Kafka topics Kafka Streams API - the ability to implement stream processing applications/microservices. 
Some of the key functionality includes functions for transformations, stateful operations like aggregations, joins, windowing, and more In the Kafka streams world, these transformations and aggregations are typically written to other topics (in from one topic, out to one or more other topics) Kafka Connect API - allows for the use of reusable import and export connectors that usually connect external systems. These connectors allow you to gather data from an external system (like a database using CDC) and write that data to Kafka. Then you could have another connector that could push that data to another system OR it could be used for transforming data in your streams application These connectors are referred to as Sources and Sinks in the connector portfolio (confluent.io) Source - gets data from an external system and writes it to a Kafka topic Sink - pushes data to an external system from a Kafka topic Use Cases Message queue - usually talking about replacing something like ActiveMQ or RabbitMQ Message brokers are often used for responsive types of processing, decoupling systems, etc. - Kafka is usually a great alternative that scales, generally has faster throughput, and offers more functionality Website activity tracking - this was one of the very first use cases for Kafka - the ability to rebuild user actions by recording all the user activities as events How and why Kafka was developed (LinkedIn) Typically different activity types would be written to different topics - like web page interactions to one topic and searches to another Metrics - aggregating statistics from distributed applications Log aggregation - some use Kafka for storage of event logs rather than using something like HDFS or a file server or cloud storage - but why? 
Because using Kafka for the event storage abstracts away the events from the files Stream processing - taking events in and further enriching those events and publishing them to new topics Event sourcing - using Kafka to store state changes from an application that are used to replay the current state of an object or system Commit log - using Kafka as an external commit log is a way for synchronizing data between distributed systems, or help rebuild the state in a failed system https://youtu.be/IuUDRU9-HRk Tip of the Week Rémi Gallego is a music producer who makes music under a variety of names like The Algorithm and Boucle Infini, almost all of it is instrumental Synthwave with a hard-rock edge. They also make a lot of video game music, including 2 of my favorite game soundtracks of all time "The Last Spell" and "Hell is for Demons" (YouTube) Did you know that the Kubernetes-focused TUI we've raved about before can be used to look up information about other things as well, like :helm and :events. Events is particularly useful for figuring out mysteries. You can see all the "resources" available to you with "?". You might be surprised at everything you see (pop-eye, x-ray, and monitoring) WarpStream is an S3 backed, API compliant Kafka Alternative. Thanks MikeRg! (warpstream.com) Cloudflare's trillion message Kafka setup, thanks Mikerg! (blog.bytebytego.com) Want the power and flexibility of jq, but for yaml? Try yq! (gitbook.io) Zenith is terminal graphical metrics for your *nix system written in Rust, thanks MikeRg! (github.com) 8 Big (O)Notation Every Developer should Know (medium.com) Another Git cheat sheet (wizardzines.com)
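The partitioning notes from this episode (events with the same key land in the same partition, and order is preserved within a partition) can be sketched in a few lines. This is not Kafka's actual partitioner: the CRC32 hash, the partition count of 3, and the partition_for helper are assumptions chosen only to keep the example deterministic and dependency-free.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # assumed partition count for this illustration


def partition_for(key, num_partitions=NUM_PARTITIONS):
    """Map a key to a partition with a stable hash (CRC32 here)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Each partition is an append-only list, standing in for a partition log.
partitions = defaultdict(list)
events = [
    ("user-1", "login"),
    ("user-2", "login"),
    ("user-1", "search"),
    ("user-1", "logout"),
]
for key, value in events:
    partitions[partition_for(key)].append((key, value))

# All of user-1's events sit in one partition, in the order they were written.
user1_partition = partition_for("user-1")
user1_events = [v for k, v in partitions[user1_partition] if k == "user-1"]
print(user1_partition, user1_events)
```

Ordering is only guaranteed per partition, which is why choosing the key matters: events that must be processed in order need to share a key.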

Intro to Apache Kafka

2024-05-26 02:04:48

We finally start talking about Apache Kafka! Also, Allen is getting acquainted with Aesop, Outlaw is killing clusters, and Joe is paying attention in drama class. The full show notes are available on the website at https://www.codingblocks.net/episode235 News Atlanta Dev Con is coming up, on September 7th, 2024 (www.atldevcon.com) Intro to Apache Kafka What is it? Apache Kafka is an open-source distributed event streaming platform used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications. Core capabilities High throughput - Deliver messages at network-limited throughput using a cluster of machines with latencies as low as 2ms. Scalable - Scale production clusters up to a thousand brokers, trillions of messages per day, petabytes of data, and hundreds of thousands of partitions. Elastically expand and contract storage and processing Permanent storage - Store streams of data safely in a distributed, durable, fault-tolerant cluster. High availability - Stretch clusters efficiently over availability zones or connect separate clusters across geographic regions. Ecosystem Built-in stream processing - Process streams of events with joins, aggregations, filters, transformations, and more, using event-time and exactly-once processing. Connect to almost anything - Kafka's out-of-the-box Connect interface integrates with hundreds of event sources and event sinks including Postgres, JMS, Elasticsearch, AWS S3, and more. Client libraries - Read, write, and process streams of events in a vast array of programming languages Large ecosystem of open source tools - Leverage a vast array of community-driven tooling. Trust and Ease of Use Mission critical - Support mission-critical use cases with guaranteed ordering, zero message loss, and efficient exactly-once processing. 
Trusted by thousands of organizations - Thousands of organizations use Kafka, from internet giants to car manufacturers to stock exchanges. More than 5 million unique lifetime downloads. Vast user community - Kafka is one of the five most active projects of the Apache Software Foundation, with hundreds of meetups around the world. What is it? Getting data in real-time from event sources like databases, sensors, mobile devices, cloud services, applications, etc. in the form of streams of events. Those events are stored "durably" (in Kafka) for processing, either in real-time or retrospectively, and then routed to various destinations depending on your needs. It's this continuous flow and processing of data that is known as "streaming data" How can it be used? (some examples) Processing payments and financial transactions in real-time Tracking automobiles and shipments in real time for logistical purposes Capture and analyze sensor data from IoT devices or other equipment To connect and share data from different divisions in a company Apache Kafka as an event streaming platform? It contains three key capabilities that make it a complete streaming platform Can publish and subscribe to streams of events Can store streams of events durably and reliably for as long as necessary (infinitely if you have the storage) To process streams of events in real-time or retrospectively Can be deployed to bare metal, virtual machines or to containers on-prem or in the cloud Can be run self-managed or via various cloud providers as a managed service How does Kafka work? 
A distributed system that's composed of servers and clients that communicate using a highly performant TCP protocol Servers Kafka runs as a cluster of one or more servers that can span multiple data centers or cloud regions Brokers - these are a portion of the servers that are the storage layer Kafka Connect - these are servers that constantly import and export data from existing systems in your infrastructure such as relational databases Kafka clusters are highly scalable and fault-tolerant Clients Allow you to write distributed applications that read, write and process streams of events in parallel, and that are fault-tolerant and scale These clients are available in many programming languages - both the ones provided by the core platform as well as 3rd party clients Concepts Events It's a record of something that happened - also called a "record" in the documentation Has a key Has a value Has an event timestamp Can have additional metadata Producers and Consumers Producers - these are the client applications that publish/write events to Kafka Consumers - these are the client applications that read/subscribe to events from Kafka Producers and consumers are completely decoupled from each other Topics Events are stored in topics Topics are like folders on a file system - events would be the equivalent of files within that folder Topics are multi-producer and multi-subscriber There can be zero, one or many producers or subscribers to a topic that write to or read from that topic respectively Unlike many message queuing systems, these events can be read as many times as necessary because they are not deleted after being consumed Deleting of messages is handled by a per-topic configuration that determines how long events are retained Kafka's performance is not dependent on the amount of data nor the duration of time data is stored, so storing for longer periods is not a problem Tip of the Week Flipper Zero is a multi-functional interaction device mixed with 
a Tamagotchi. It has a variety of IO options built in, RFID, NFC, GPIO, Bluetooth, USB, and a variety of low-voltage pins like you'd see on an Arduino. Using the device upgrades the dolphin, encouraging you to try new things…and it's all open-source with a vibrant community behind it. (shop.flipperzero.one) 7 cool and useful things to do with your Flipper Zero Kafka Tui?! Kaskade is a cool-looking Kafka TUI that has got to be better than using the scripts in the build folder that comes with Kafka. (github.com/sauljabin/kaskade) Microstudio is a web-based integrated development environment for making simple games and it's open source! (microstudio.dev) Bing Copilot has a number of useful prompts (bing.com) Designer (photos) Vacation Planner Cooking assistant Fitness trainer Sharing metrics between projects in GCP, Azure, and maybe AWS??? GCP (projects): (cloud.google.com) Azure (resource groups or subscriptions): (learn.microsoft.com) AWS (multiple accounts): (docs.aws.amazon.com) Checking wifi in your home - Android Only (play.google.com) Powering POE without running cables (Amazon) Omada specific - cloud vs local hardware (Amazon) How to "shutdown" a Kafka cluster in Kubernetes: kubectl annotate kafka my-kafka-cluster strimzi.io/pause-reconciliation="true" --context=my-context --namespace=my-namespace kubectl delete strimzipodsets my-kafka-cluster --context=my-context --namespace=my-namespace Then to "restart" the cluster: kubectl annotate kafka my-kafka-cluster strimzi.io/pause-reconciliation- --context=my-context --namespace=my-namespace
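The Kafka concepts in the notes above (events with a key, value, and timestamp; topics that can be re-read; time-based retention) can be sketched as a toy in-memory log. This is not Kafka's client API: the `Event` and `Topic` classes, their method names, and the `retention_seconds` parameter are all hypothetical names made up for illustration, and the sketch assumes events arrive in timestamp order.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    """An event/record: key, value, timestamp, optional metadata."""
    key: str
    value: str
    timestamp: float
    headers: dict = field(default_factory=dict)


class Topic:
    """Toy append-only log (hypothetical, not the Kafka API)."""

    def __init__(self, retention_seconds: float):
        self.retention_seconds = retention_seconds
        self._log = []          # retained events, oldest first
        self._base_offset = 0   # offset of the first retained event

    def publish(self, event: Event) -> int:
        """Append an event and return its offset."""
        self._log.append(event)
        return self._base_offset + len(self._log) - 1

    def read(self, offset: int) -> list:
        """Return all retained events at or after `offset`.

        Reading does not delete anything, so any number of
        consumers can read the same events repeatedly.
        """
        start = max(offset - self._base_offset, 0)
        return self._log[start:]

    def expire(self, now: float) -> None:
        """Drop events older than the retention window."""
        cutoff = now - self.retention_seconds
        kept = [e for e in self._log if e.timestamp >= cutoff]
        self._base_offset += len(self._log) - len(kept)
        self._log = kept
```

Each consumer would track its own offset and call `read(offset)`, which is what keeps producers and consumers decoupled in this model: events disappear only when `expire` runs, never because someone consumed them.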

StackOverflow AI Disagreements, Kotlin Coroutines and More

2024-05-1301:41:39

https://www.codingblocks.net/episode234
Reviews
- iTunes: ivan.kuchin
News
- Atlanta Dev Con September 7th, 2024 https://www.atldevcon.com/
Topics
- People trying to remove their answers from StackOverflow to not allow OpenAI to use their answers without permission/recognition? https://www.tomshardware.com/tech-industry/artificial-intelligence/stack-overflow-bans-users-en-masse-for-rebelling-against-openai-partnership-users-banned-for-deleting-answers-to-prevent-them-being-used-to-train-chatgpt
- Obfuscate data dumps with PostgreSQL https://github.com/GreenmaskIO/greenmask/
- Kotlin Coroutines https://kotlinlang.org/docs/coroutines-overview.html https://kotlinlang.org/docs/coroutine-context-and-dispatchers.html#dispatchers-and-threads
  - Reminded Outlaw of the Cloudflare Workers we mentioned a while back https://developers.cloudflare.com/workers/
- Please leave us a review! https://www.codingblocks.net/review
- You can control if YouTube keeps track of your history (at least that you can see)
- 100 Things You Didn't Know About Kubernetes https://www.devopsinside.com/100-things-you-didnt-know-about-kubernetes-part-1/
- Do the IDE AIs really make you more productive?
Random Bits
- Tesla Las Vegas Loop https://www.lvcva.com/vegas-loop/
- What actually happens when you overfill the oil in a vehicle? https://www.youtube.com/watch?v=VaTbfvzNbxQ
- Fisker Ocean totalled after a $900 door ding...really https://jalopnik.com/fisker-ocean-totaled-over-910-door-ding-after-insurer-1851451187
- A Ford Mustang painted with the blackest black paint available https://youtu.be/Ll27OkWuE1g
Tip of the Week
- Docker Blog is pretty excellent https://www.docker.com/blog/
- Car Research
  - Car reliability information https://www.truedelta.com/
  - Actual problems logged with car models by year https://www.carcomplaints.com/
  - Great search engine for finding cars and more metadata about the listing like how long the car has been listed https://caredge.com/
- Getting the most out of wood sheet goods by using cut lists https://www.opticutter.com/cut-list-optimizer
- Docker's chicken-n-egg problem
  - Use a multi-stage Dockerfile where an earlier stage has the tools you need
  - Manually dearmor a PGP public key (Hint: it's the opposite of: https://superuser.com/questions/764465/how-to-ascii-armor-my-public-key-without-installing-gpg)
- Download the file using the server suggested name
  - With wget ... --content-disposition https://man7.org/linux/man-pages/man1/wget.1.html
  - With curl ... -JO
    - -J, --remote-header-name
    - -O, --remote-name
    - https://curl.se/docs/manpage.html#-J
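The "server suggested name" in the download tip above comes from the HTTP `Content-Disposition` response header. As a rough sketch of the idea only (not how wget or curl implement it), the function below picks a local file name from that header and falls back to the last segment of the URL path. The function name `download_name` and the `"index.html"` fallback are assumptions for illustration.

```python
import posixpath
from email.message import Message
from typing import Optional
from urllib.parse import urlsplit


def download_name(url: str, content_disposition: Optional[str]) -> str:
    """Choose a local file name for a download (illustrative sketch)."""
    if content_disposition:
        # Let the standard library parse the header parameters.
        msg = Message()
        msg["Content-Disposition"] = content_disposition
        suggested = msg.get_filename()
        if suggested:
            # Keep only the final path component so a server cannot
            # suggest a name that writes outside the current directory.
            name = posixpath.basename(suggested)
            if name:
                return name
    # No usable suggestion: fall back to the last URL path segment.
    return posixpath.basename(urlsplit(url).path) or "index.html"
```

For example, `download_name("https://example.com/download?id=7", 'attachment; filename="report.pdf"')` picks the header's name rather than `download`, which is the behavior the `--content-disposition` and `-J` flags opt into.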

Llama 3 is Here, Spending Time on Environmental Setup and More

2024-04-2801:33:37

Full episode show notes can be found at: https://www.codingblocks.net/episode233

Ktor, Logging Ideas, and Plugin Safety

2024-04-1401:38:39

Picture, if you will, a nondescript office space, where time seems to stand still as programmers gather around a water cooler. Here, in the twilight of the workday, they exchange eerie tales of programming glitches, security breaches, and asynchronous calls. Welcome to the Programming Zone, where reality blurs and (silent) keystrokes echo in the depths of the unknown. Also, Allen is ready to boom, Outlaw is not happy about these category choices, and Joe takes the easy (but not longest) road. The full show notes are available on the website at https://www.codingblocks.net/episode232
News
- Thanks for the reviews! Want to help us out? Leave a review! (/reviews)
  - ivan.kuchin, Nick Brooker, Szymon, JT, Scott Harden
- Text replacements are tricky: replacing links to "twitter.com" with "x.com" enabled a wave of domain spoofing attacks. (arstechnica.com)
Around the Water Cooler
- Ktor is an asynchronous web framework based on Kotlin, but can it compete with Spring? (ktor.io)
- docker init is a great tool for getting started, but how much can you expect from a scaffolding tool? (docs.docker.com)
- Logging, how much is too much? What if we could go back in time?
- Boomer Hour: Let's talk about GChat UX
- What do you know about browser extensions?
  - ViolentMonkey is a modern remake of the infamous GreaseMonkey, but can you trust it? (chromewebstore.google.com)
  - Can you trust any extensions?
  - XZ Utils backdoor timeline, wow (arstechnica.com)
- Bookmarklets still rock! (freecodecamp.org)
- Silent Key Tester for mechanical keyboards, you can specify a wide variety of switches (thockking.com)
  - Joe's preferences: Durock Shrimp Silent T1 Tactile Gazzew Boba U4 Silent Liner Kailh Silent Brown Linear Lichicx Lucy Silent Linear WS Wuque Studio Gray Silent Tactile WS Wuque Studio White Silent - Linear Tactile Kailh Silent Pink Linear Cherry MX Silent Red
Tip of the Week
- Feeling nostalgic for the original GameBoy or GameBoy Color? GBStudio is a one-stop shop for making games; it's open-source and fully featured. You can do the art, music, and programming all in one tool, and it's thoughtfully laid out and well-documented. Bonus…your games will work in GameBoy emulators AND you can even produce your own working physical copies. (If you don't want the high-level tools you can go old skool with "GBDK" too) (gbstudio.dev)
- If you're going to do something, why not script it? If you're going to script it, save it for next time!
- Dave's Garage is a YouTube channel that does deep dives into Windows internals, cool electronics projects, and everything in between! (YouTube)
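The "text replacements are tricky" news item above is easy to reproduce. The sketch below contrasts a naive substring replacement with one that only rewrites `twitter.com` when it appears as a complete host name (or the tail of a subdomain). The function names and the regular expression are illustrative assumptions, not the code involved in the actual incident, and the look-alike domain is just an example input.

```python
import re


def naive_rewrite(text: str) -> str:
    """Replace the substring anywhere, even inside other domains."""
    return text.replace("twitter.com", "x.com")


# Match "twitter.com" only when it is not glued to other host
# characters: nothing alphanumeric or "-" directly before it, and
# neither more label characters nor another ".label" after it.
_TWITTER_HOST = re.compile(
    r"(?<![A-Za-z0-9-])twitter\.com(?![A-Za-z0-9-]|\.[A-Za-z0-9])"
)


def safe_rewrite(text: str) -> str:
    """Rewrite only when twitter.com stands as a whole domain."""
    return _TWITTER_HOST.sub("x.com", text)
```

With the naive version, a link to a domain such as `fedetwitter.com` is displayed as `fedex.com`, which is the spoofing opening: an attacker registers the long name and lets the replacement do the disguising. The boundary-aware version leaves that link untouched while still rewriting `https://twitter.com/user` and `mobile.twitter.com`.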

Importance of Data Structures, Bad Documentation and Comments and More

2024-04-0101:40:43

Full show notes at: https://www.codingblocks.net/episode231

Decorating your Home Office

2024-03-1801:21:18

This time we are missing the "ocks", but we hope you enjoy this off...ice topic chat about personalizing our workspaces. Also, Joe had to put a quarter in the jar, and Outlaw needs a cookie. The full show notes are available on the website at https://www.codingblocks.net/episode230
News
- Thank you for the review Szymon!
- Want to leave us a review?
Decorating your Home Office
- Joe's Uplift Desk Review
- Mounting monitors, is there any other way?
- To grommet or not to grommet?
- How many keys do you want on your keyboard?
- Wired vs Wireless
- About that "fn" key…
- Reddit for inspiration?
- Office-Appropriate Art
  - Paintings
  - Prints / Silk Screens / Photography
  - Sculptures
- Book Cases
- There's a story for Outlaw about this print: https://www.johndyerbaizley.com/product/four-horsemen-full-color-ap
Tip of the Week
- If you have a car, you should consider getting a Mirror Dash Cam. It's a front and rear camera system that replaces your rearview mirror with a touchscreen. Impress all your friends with your recording, zoom, night vision, parking assistance, GPS, and 24/7 recording and monitoring. (Amazon)
- Be careful about exercising after you give blood, else you might end up needing it back! (redcrossblood.org)

Multi-Value, Spatial, and Event Store Databases

2024-03-0401:07:14

We are mixing it up on you again, no Outlaw this week, but we can offer you some talk of exotic databases. Also, Joe pronounces everything correctly and Allen leaves you with a riddle. The full show notes are available on the website at https://www.codingblocks.net/episode229
News
- Thanks for the reviews! ivan.kuchin (has taken the lead!), Yoondoggy, cykoduck, nehoraigold
- Want to help us out? Leave a review! (reviews)
Multivalue DBMS
- Popular: 86. Adabas, 87. UniData/UniVerse, 147. JBase
- Similar to RDBMS - store data in tables
- Store multiple values to a particular record's attribute
  - Some RDBMS's can do this as well, BUT it's typically an exception to the rule when you'd store an array on an attribute
  - In a MultiValue DBMS - that's how you SHOULD do it
  - Part of the reason it's done this way is these database systems are not optimized for JOINS
- Looked at the Adabas and UniData sites - the primary selling points seem to be rapid application development / ease of learning and getting up to speed, as well as data modeling that closely mirrors your application data structures
- I BELIEVE it's a schema on write (docs.rocketsoftware.com)
- Supposed to be very performant as you access the data the way your application expects it
- Per the docs, it's easy to maintain (Wikipedia)
Spatial DBMS
- Popular: 29. PostGIS, 59. Aerospike, 136. SpatiaLite
- Provides the ability to efficiently store, modify, and query spatial data - data that appears in a geometrical space (maps, polygons, etc)
- Generally have custom data types for storing the spatial data
- Indices that allow for quick retrieval of spatial data in relation to other spatial data
- Also allow for performing spatial-specific operations on data, such as computing distances, merging or intersecting objects, or even calculating areas
- Geospatial data is a subset of spatial data - it represents places / spatial data on the Earth's surface
- Spatio-temporal data is another variation - spatial data combined with timestamps
- PostGIS - basically a plugin for PostgreSQL that allows for storing of spatial data
  - Additionally supports raster data - data for things like weather and elevation
  - If you want to learn how to use it and understand the data and what's stored (postgis.net)
  - Spatial data types are: point, line, polygon, and more…basically shapes
  - Rather than using B-tree indexes to sort data for fast retrieval, spatial indexes use bounding boxes - rectangles that identify what is contained within them
  - Typically accomplished with R-Tree and Quadtree implementations
  - RedFin - a real estate competitor to realtor.com and others, uses PostgreSQL / PostGIS
- Quite a bit of software supports OpenGIS, so it may be a good place to start if you're interested in storing/querying spatial data
Event Stores
- Popular: 178. EventStoreDB, 336. IBM DB2 Event Store, 338. NEventStore
- Used for implementing the concept of Event Sourcing
  - Event Sourcing - an application/data store where the current state of an object is obtained by "replaying" all the events that got it to its current state
  - This contrasts with RDBMS's in that relational databases typically store the current state of an object - historical state CAN be stored, but that's an implementation detail that has to be implemented, such as temporal tables in SQL Server or "history tables"
- Only support adding new events and querying the order of events
- Not allowed to update or delete an event
- For performance reasons, many Event Store databases support snapshots for holding materialized states at points in time
- EventStoreDB - https://www.eventstore.com/eventstoredb
  - Defined as an "immutable log"
  - Features: guaranteed writes, concurrency model, granulated stream and stream APIs
  - Many client interfaces: .NET, Java, Go, Node, Rust, and Python
  - Runs on just about all OSes - Windows, Mac, Linux
  - Highly available - can run in a cluster
  - Optimistic concurrency checks that will return an error if a check fails
  - "Projections" allow you to generate new events based off "interesting" occurrences in your existing data
    - For example, you are looking for how many Twitter users said "happy" within 5 minutes of the word "foo coffee shop" and within 2 minutes of saying "London".
  - Highly performant - 15k writes and 50k reads per second
Resources we like
- Database Rankings (db-engines.com)
Tip of the Week
- If your internet connection is good, but your cell phone service is bad, then you might want to consider Ooma. Ooma sells devices that plug into your network or connect wirelessly and provide a phone number and a phone jack so you can hook up an old-school home telephone. We've been using it for about a week now with no problems and it's been a breeze to set up. The devices range from $99 to $129 and there's a monthly "premier" plan you can buy with nifty features like a secondary phone line, advanced call blocking, and call forwarding. (ooma.com)
- Why use "git reset --hard" when you can "git stash -u" instead? Reset is destructive, but stashing keeps your changes just in case you need them. Because sometimes, your "sometimes" is now! 🚫 "git reset --hard". ✅ "git stash -u"
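To make the Event Stores notes above more concrete, here is a minimal sketch of event sourcing with an append-only stream, a snapshot, and an optimistic concurrency check. It is an in-memory toy under stated assumptions, not the EventStoreDB client API: the bank-balance domain, the `Event`, `EventStream`, and `ConcurrencyError` names, and the `expected_version` parameter are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str    # "deposited" or "withdrew" (made-up event types)
    amount: int


class ConcurrencyError(Exception):
    """Raised when a writer's expected version is stale."""


def apply(balance: int, event: Event) -> int:
    """Fold one event into the current state."""
    if event.kind == "deposited":
        return balance + event.amount
    if event.kind == "withdrew":
        return balance - event.amount
    raise ValueError(f"unknown event kind: {event.kind}")


class EventStream:
    """Append-only stream: events are never updated or deleted."""

    def __init__(self):
        self._events = []
        self._snapshot = (0, 0)  # (version covered, balance at that version)

    @property
    def version(self) -> int:
        return len(self._events)

    def append(self, event: Event, expected_version: int) -> None:
        # Optimistic concurrency: fail if someone else wrote first.
        if expected_version != self.version:
            raise ConcurrencyError(
                f"expected version {expected_version}, stream is at {self.version}"
            )
        self._events.append(event)

    def take_snapshot(self) -> None:
        """Materialize the state so later replays can start here."""
        self._snapshot = (self.version, self.current_state())

    def current_state(self) -> int:
        """Replay events recorded after the snapshot to get the state."""
        version, balance = self._snapshot
        for event in self._events[version:]:
            balance = apply(balance, event)
        return balance
```

The state is never stored directly: `current_state` derives it by replaying events, and the snapshot only shortens the replay. A writer that read the stream at version 2 and tries to append after someone else already has will get a `ConcurrencyError` instead of silently overwriting, which is the kind of check the notes describe as returning an error when it fails.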

Overview of Object Oriented, Wide Column, and Vector Databases

2024-02-1902:03:38

Show notes at https://www.codingblocks.net/episode228

Picking the Right Database Type - Tougher than You Think

2024-02-0402:10:55

For the full show notes, head to: https://www.codingblocks.net/episode227

Comments (16)

Philip C

sad to hear about the coding block break (end?). thanks for the great episodes and tips of the week. I'll probably recycle the episode once in awhile to hear relevant topics. enjoy the break

Oct 8th

Holt Robyn

I like this song very much I often listen to them when playing games.You can play Among Us https://amonguscombo.com/ I find this game very fun and easy

May 28th

Charley

This is the best programming/software development podcast I've ever listened to. I'm new to programming and although some topics are quite in depth, it's still easy to follow along and gives you so much more to look into and learn more about after listening . Allen, Michael and Joe are really funny and I'm always laughing while listening. Thanks guys

Apr 26th

Tanya Georgieva

I would like to leave a more recent comment, I am listening to this podcast for about an year now, still trying to catch up with all old episodes, but I wanted to underline that one can always learn from these guys, Joe, Allen and Michael, starting from development and ending with the Shopping Spree..My list with stuff to learn/buy grows and grows. Thanks a lot :)

Aug 30th

Justin Anderson

I started this podcast 8 months ago and am hooked. These guys breach subjects that rarely get aired in such a forum and do so in a way that is easy to digest and fun to listen to. They are constantly participating in free give aways, sharing with the community the gems of literature they find. Kudos guys! I won't say no to a free book, especially this one "Designing Data-Intensive Applications" should you happen up this. Keep it up!

Nov 25th

Daniel Rivero Padilla

Funny, guys you didn't do any mention yet of Emacs, and also in the next episode (which I listened before) about generating code, no mention of macros in Lisp, which is one of the most powerful ways of generating code that exist.

Aug 13th

Alessandro Cerasino

this podcast is fantastic.

Dec 30th

Sankara Subramanian

Dec 5th

Javier Pazos

great podcast about software development. I love how candid is the show. I learn a lot and is also great hear what other developers do and how they think

Oct 1st

Daniel Eguia

another great one, thanks for all the confusion.

Jul 18th

Timothy Halse

can you please put the show notes on here if possible

Apr 24th

wow.... contains a plethora of code resources. Thanks Guys

Apr 3rd

Trevor Richardson

Solid show. I really appreciate the open discussion on development as a practice and as a profession. Learn something new every episode. Keep up the good work!

Mar 24th

Vadim Gutman

great episode guys

Oct 30th
