Data Science Tech Brief By HackerNoon

Author: HackerNoon


Description

Learn the latest data science updates in the tech world.
137 Episodes
This story was originally published on HackerNoon at: https://hackernoon.com/how-to-optimize-your-marketing-budget-using-just-three-letters-mmm. Marketing Mix Modeling is a statistical analysis method used in marketing to determine the optimal allocation of resources. Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #data-science, #marketing-analytics, #machine-learning, #marketing, #marketing-budget, #marketing-mix-modeling, #media-mix-modelling, #adstock-and-saturation, and more. This story was written by: @radiokocmoc_l45iej08. Learn more about this writer by checking @radiokocmoc_l45iej08's about page, and for more stories, please visit hackernoon.com. Marketing Mix Modeling is a statistical analysis method used in marketing to determine the optimal allocation of resources. The goal of media mix modelling is to understand the impact of different marketing channels on the overall campaign effectiveness. Join me to discover how to optimise the marketing budget by implementing Robyn MMM.
This story was originally published on HackerNoon at: https://hackernoon.com/heres-how-sharechat-scaled-their-ml-feature-store-1000x-without-scaling-the-database. How ShareChat scaled its ML feature store to 1B features/sec on ScyllaDB, achieving 1000X performance without scaling the database. Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #sharechat-ml-feature-store, #scylladb-scaling-case-study, #ml-feature-store-optimization, #sharechat-moj, #low-latency-ml-infrastructure, #scylladb-database-optimization, #p99-conf-sharechat-talk, #good-company, and more. This story was written by: @scylladb. Learn more about this writer by checking @scylladb's about page, and for more stories, please visit hackernoon.com. ShareChat scaled its ML feature store from failure at 1M features/sec to 1B features/sec using ScyllaDB optimizations, caching hacks, and relentless tuning. By rethinking schemas, tiling, and caching strategies, engineers avoided scaling the database, cut latency, and boosted cache hit rates—proving performance engineering beats brute-force scaling.
This story was originally published on HackerNoon at: https://hackernoon.com/why-you-shouldnt-judge-by-pnl-alone. PnL can lie. This hands-on guide shows traders how hypothesis testing separates luck from edge, with a Python example and tips on how not to fool yourself. Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #quantitative-research, #trading, #algorithmic-trading, #pnl, #udge-pnl, #profit-and-loss, #judge-profit-and-loss, #hackernoon-top-story, and more. This story was written by: @ruslan4ezzz. Learn more about this writer by checking @ruslan4ezzz's about page, and for more stories, please visit hackernoon.com. I’ve spent years building and evaluating systematic strategies across highly adversarial markets. When you iterate on a trading system, PnL is the goal but a terrible day-to-day signal. It’s too noisy, too path-dependent, and too easy to cherry-pick. A simple framework helps you avoid false wins, iterate faster, and ship changes that actually stick: form a hypothesis, measure a test statistic, and translate it into a probability under a “no-effect” world (the p-value). Below I’ll show a concrete example where two strategies look very different in cumulative PnL charts, yet standard tests say there’s no meaningful difference in their average per-trade outcome. I’ll also demystify the t-test in plain language: difference of means, scaled by uncertainty.
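The "difference of means, scaled by uncertainty" idea from the episode above can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: the function names `welch_t` and `approx_two_sided_p` and the per-trade PnL numbers are hypothetical, and the p-value uses a normal approximation that is only reasonable for large samples.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def welch_t(a, b):
    """Welch's t statistic: difference of means divided by its standard error."""
    # Standard error combines each sample's variance scaled by its size.
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def approx_two_sided_p(t):
    """Two-sided p-value under a normal approximation (assumption: large samples)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


# Hypothetical per-trade PnL for two strategies (made-up numbers).
strategy_a = [1.0, 2.0, 3.0, 4.0, 5.0]
strategy_b = [2.0, 3.0, 4.0, 5.0, 6.0]

t_stat = welch_t(strategy_a, strategy_b)
p_value = approx_two_sided_p(t_stat)
print(f"t = {t_stat:.3f}, approximate p = {p_value:.3f}")
```

With samples this small, a proper t distribution (for example from a statistics library) would be the better choice; the normal approximation is used here only to keep the sketch dependency-free.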
This story was originally published on HackerNoon at: https://hackernoon.com/from-decentralized-to-unified-supcon-uses-seatunnel-to-build-an-efficient-data-collection-frame. SUPCON dumped siloed data tools for Apache SeaTunnel—now core sync tasks run 0-failure! Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #bigdata, #apacheseatunnel, #supcon, #data-sync, #high-availability, #data-engineering, #cdc, #hackernoon-top-story, and more. This story was written by: @williamguo. Learn more about this writer by checking @williamguo's about page, and for more stories, please visit hackernoon.com. 99% lower failures, 100% consistency, 70% less O&M cost. Big thanks to @ApacheSeaTunnel!
This story was originally published on HackerNoon at: https://hackernoon.com/enterprise-data-pipeline-revolution-suresh-pallis-metadata-driven-automation-success. Suresh Palli revolutionized enterprise data pipelines with metadata-driven automation, cutting dev time 40% and boosting scalability 5x. Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #suresh-palli, #metadata-driven-automation, #enterprise-data-pipelines, #data-pipeline-automation, #metadata-governance, #enterprise-data-architecture, #scalable-data-processing, #good-company, and more. This story was written by: @sanya_kapoor. Learn more about this writer by checking @sanya_kapoor's about page, and for more stories, please visit hackernoon.com. Suresh Palli led a metadata-driven automation project that cut pipeline development time by 40% and scaled data processing 5x. His centralized metadata governance enabled dynamic adaptation, seamless orchestration, and cross-unit alignment. The success earned industry recognition, consulting opportunities, and set new benchmarks for enterprise data automation.
This story was originally published on HackerNoon at: https://hackernoon.com/unified-data-smarter-agentsis-your-architecture-future-proof. A hands-on guide to architecting unified, governed and AI-ready data platforms using open table formats, semantic layers and multicloud governance. Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #data, #big-data-analytics, #product, #ai, #etl, #azure, #aws, #data-engineering, and more. This story was written by: @@QueryAndConquer. Learn more about this writer by checking @@QueryAndConquer's about page, and for more stories, please visit hackernoon.com. A hands-on guide to architecting unified, governed and AI-ready data platforms using open table formats, semantic layers and multicloud governance.
This story was originally published on HackerNoon at: https://hackernoon.com/data-driven-decisions-at-scale-ab-testing-best-practices-for-engineering-and-data-science-teams. Ship features like scientists: randomize, measure, and learn fast. Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #data-science, #big-data, #experimentation, #experimental-design, #product-development, #software-engineering, #machine-learning, #statistics, and more. This story was written by: @sayantan. Learn more about this writer by checking @sayantan's about page, and for more stories, please visit hackernoon.com. Ship features like scientists: randomize, measure, and learn fast. Good A/B tests aren’t just stats — they’re the engine driving smarter products.
This story was originally published on HackerNoon at: https://hackernoon.com/why-you-shouldnt-almost-always-choose-sync-gunicorn-over-workers-ze9c32wj. Anyone working with a WSGI web application framework like Flask knows that, as a best practice, it is important to use a WSGI HTTP server like Gunicorn to deploy the app outside the development server. Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #python-programming, #gevent, #gunicorn, #python-web-development, #flask, #flask-deployment, #latest-tech-stories, #what-are-gunicorn-worker-types, and more. This story was written by: @shamik-ray. Learn more about this writer by checking @shamik-ray's about page, and for more stories, please visit hackernoon.com. Gunicorn is a widely popular WSGI server; it is popular because it is lightweight, fast, and simple, yet supports most of the requirements of hosting an app in production. The default worker type is Sync, and I will be arguing for it. Async workers like Gevent create new greenlets (lightweight pseudo-threads): every time a new request comes in, it is handled by a greenlet spawned by the worker. At the same time, the resources needed to serve the requests will be lower.
This story was originally published on HackerNoon at: https://hackernoon.com/beyond-the-ten-blue-links-how-generative-ai-rewires-our-brains-for-search. The age of searching is ending. A deep dive into the psychology of AI search, how it centralizes truth, and why becoming a trusted source is key to brand survival. Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #user-behavior-analytics, #ai-integrated-search, #digital-marketing, #seo, #geo, #future-tech, #psychology, #product-management, and more. This story was written by: @a_belova. Learn more about this writer by checking @a_belova's about page, and for more stories, please visit hackernoon.com. Generative AI isn't just a new feature in search; it's a fundamental psychological shift. By providing direct, synthesized answers, it caters to our brain's deep-seated desire to reduce cognitive load and trust authoritative narratives. This "great untraining" is rendering the classic marketing playbook obsolete. For businesses, developers, and marketers, the battle is no longer for clicks on blue links, but for becoming a trusted, citable source inside the AI's "brain." The age of persuasion is ending; the age of becoming a machine-readable source of truth has begun.
This story was originally published on HackerNoon at: https://hackernoon.com/need-web-data-here-are-the-3-methods-everyones-using. Discover the three best, most modern methods to access and harness web data for your projects. Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #web-data, #ai, #web-scraping, #sdk, #api, #mcp, #python, #good-company, and more. This story was written by: @brightdata. Learn more about this writer by checking @brightdata's about page, and for more stories, please visit hackernoon.com. Need web data? APIs, SDKs, and MCP provide flexible, scalable, and automated ways to access, scrape, and integrate web data for scripts, backends, web apps, pipelines, or AI agents.
This story was originally published on HackerNoon at: https://hackernoon.com/applying-transitive-closure-to-sort-products-into-categories-considering-nesting-and-overlaps. A guide to efficiently managing nested categories and overlapping products, ensuring fast retrieval without duplicates in e-commerce systems. Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #data-management, #software-architecture, #product-categorization, #graph-theory, #microservices, #optimize-data-storage, #transitive-closure, #advanced-indexing, and more. This story was written by: @egorgrushin. Learn more about this writer by checking @egorgrushin's about page, and for more stories, please visit hackernoon.com. Handling product categorization in e-commerce can be quite the task, especially when nested categories and overlapping products make efficient retrieval without duplicates a real challenge. The method I found has a major impact on performance: setting up proper data storage, separating data for reading and modification, using relational and NoSQL databases, and applying graph theory to handle complex category nesting. The step-by-step guide shows how to sort out efficient data storage, use transitive closure for advanced indexing, build a service to maintain and update the graph, and take advantage of database indexing to avoid unnecessary sorting in RAM.
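The transitive-closure idea from the episode above can be illustrated with a small in-memory sketch. It is not the author's implementation (the article describes a database-backed service): the names `transitive_closure` and `products_in`, the category tree, and the product IDs are all hypothetical, and the category graph is assumed to be acyclic.

```python
def transitive_closure(children):
    """Map every category to the set containing itself and all its descendants.

    `children` maps a category to its direct subcategories.
    Assumption: the category graph has no cycles.
    """
    closure = {}

    def visit(category):
        if category not in closure:
            reachable = {category}
            for child in children.get(category, ()):
                reachable |= visit(child)
            closure[category] = reachable
        return closure[category]

    all_categories = set(children) | {c for kids in children.values() for c in kids}
    for category in all_categories:
        visit(category)
    return closure


def products_in(category, closure, direct_products):
    """Return the deduplicated, sorted products of a category and its descendants."""
    found = set()
    for member in closure[category]:
        found |= direct_products.get(member, set())
    return sorted(found)


# Hypothetical data: "smartphones" is nested under "phones" and also under "gifts",
# and product "p1" is attached to two categories (an overlap).
children = {
    "electronics": ["phones", "laptops"],
    "phones": ["smartphones"],
    "gifts": ["smartphones"],
}
direct_products = {
    "smartphones": {"p1", "p2"},
    "laptops": {"p3"},
    "phones": {"p1"},
}

closure = transitive_closure(children)
print(products_in("electronics", closure, direct_products))
```

Precomputing the closure trades extra storage for fast reads: a lookup becomes a union over a known set of categories instead of a recursive walk, and using sets removes duplicates when a product appears under several nested categories.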
This story was originally published on HackerNoon at: https://hackernoon.com/98percent-of-data-strategies-fail-lets-fix-it. Learn how to fix failing data strategies using the '5 W's' framework. Transform your approach to KPIs and drive real business value with actionable insights. Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #data-strategy, #kpi-management, #business-intelligence, #data-driven-decisions, #executive-leadership, #analytics-roi, #data-roi, #data-governance, and more. This story was written by: @liorb. Learn more about this writer by checking @liorb's about page, and for more stories, please visit hackernoon.com. Even the most well-equipped organizations can find themselves serving up a mess instead of actionable insights. Here's a step-by-step process of fixing your data strategy, ensuring that you're serving up actionable data instead of a recipe for disaster. In the following sections, we'll dive into the common data strategy nightmares.
This story was originally published on HackerNoon at: https://hackernoon.com/how-to-measure-the-results-of-in-app-events-when-onelinks-dont-work. How To Measure The Results Of In-App Events When Onelinks Don’t Work. Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #analytics, #onelink, #inapp-events, #marketing, #app-store, #mobile-apps, #digital-marketing, #good-company, and more. This story was written by: @socialdiscoverygroup. Learn more about this writer by checking @socialdiscoverygroup's about page, and for more stories, please visit hackernoon.com. Many app developers and marketing managers face the challenge of accurately measuring the impact of In-App Events (IAEs) on the App Store. While IAEs have proven effective for re-engaging users, attracting new downloads, and increasing revenue, traditional tracking methods like OneLink don’t actually include IAEs. Major mobile attribution platforms confirm that currently there is no way to track IAEs properly. At Social Discovery Group, our portfolio of 60+ dating and entertainment brands is supported by a team of over 100 marketers dedicated to app growth and development. We’re used to measuring all our marketing efforts in terms of financial value. Eventually, we managed to develop our own composite way to evaluate IAEs, and we are going to share it with you.
This story was originally published on HackerNoon at: https://hackernoon.com/how-ai-powered-data-mapping-is-democratizing-data-management. Learn how AI-powered data mapping is transforming data management, making it more accessible and efficient for everyone. Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #data-mapping, #data-management, #big-data, #ai-powered, #ai-powered-data-management, #democratizing-data-management, #data-science, #ai-powered-data-mapping, and more. This story was written by: @kristenburke. Learn more about this writer by checking @kristenburke's about page, and for more stories, please visit hackernoon.com. AI is revolutionizing data mapping by automating and simplifying the process, making data management more efficient and accessible for businesses and non-technical users alike.
This story was originally published on HackerNoon at: https://hackernoon.com/data-engineering-whats-the-value-of-api-security-in-the-generative-ai-era. Discover the importance of API security in the age of Generative AI. Learn how robust API protection ensures data integrity. Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #data-engineering, #generative-ai, #ai-regulation, #api-security, #data-security, #data-privacy, #threat-detection, #cybersecurity-best-practices, and more. This story was written by: @karthikrajashekaran. Learn more about this writer by checking @karthikrajashekaran's about page, and for more stories, please visit hackernoon.com. API security is crucial in the era of Generative AI, ensuring data integrity, protecting user privacy, and enabling secure and efficient AI integration. Robust API protection helps prevent unauthorized access, data breaches, and potential misuse of AI capabilities.
This story was originally published on HackerNoon at: https://hackernoon.com/say-goodbye-to-outdated-diagrams-automate-your-infrastructure-visualization. Automate your infrastructure diagrams. Guide helps you maintain fresh, accurate visuals with minimal effort, perfect for managing Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #visualization, #cloud-infrastructure, #terraform, #diagram, #infrastructure-as-code, #cloud, #aws, #infrastructure-visualization, and more. This story was written by: @vladimirf. Learn more about this writer by checking @vladimirf's about page, and for more stories, please visit hackernoon.com. Tired of making awesome infrastructure diagrams that become outdated as soon as you save them? Yeah, me too. Luckily, there are tools out there to help.
This story was originally published on HackerNoon at: https://hackernoon.com/why-c-suite-executives-wont-cut-it-without-data-skills-anymore. Modern executives must master data skills to navigate data privacy, cybersecurity, and strategic decisions. Learn why C-suite leaders can't afford to lag behind. Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #data-literacy, #thought-leadership, #leadership-skills, #data-skills, #data-governance, #data-visualization-tools, #cybersecurity-executives, #hackernoon-top-story, and more. This story was written by: @znenad079. Learn more about this writer by checking @znenad079's about page, and for more stories, please visit hackernoon.com. Every industry generates massive amounts of data, which is now being used for better decision-making. Among today’s most pressing challenges for executives are data privacy and cybersecurity. Modern executives must have data skills to understand the flow of valuable data in their company and know how to make it work for them.
This story was originally published on HackerNoon at: https://hackernoon.com/meet-new-and-improved-bigquery-single-unified-ai-ready-data-platform. Google has gone a step further and unified key Google Cloud data analytics capabilities under BigQuery, now the single, AI-ready data analytics platform. Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #data-analytics, #google-bigquery, #bigquery-and-google-cloud, #ai-integration, #big-query-and-gemini, #good-company, #hackernoon-top-story, #real-time-data-analytics, and more. This story was written by: @googlecloud. Learn more about this writer by checking @googlecloud's about page, and for more stories, please visit hackernoon.com. We’ve gone a step further and unified key Google Cloud data analytics capabilities under BigQuery, which is now the single, AI-ready data analytics platform. BigQuery incorporates key capabilities from multiple Google Cloud analytics services into a single product experience that offers the simplicity and scale you need to manage structured data in BigQuery tables, unstructured data like images, audio, and documents, and streaming workloads, all with the best price-performance.
This story was originally published on HackerNoon at: https://hackernoon.com/decoding-transformers-superiority-over-rnns-in-nlp-tasks. Explore the intriguing journey from Recurrent Neural Networks (RNNs) to Transformers in the world of Natural Language Processing in our latest piece: 'The Trans Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #nlp, #transformers, #llms, #natural-language-processing, #large-language-models, #rnn, #machine-learning, #neural-networks, and more. This story was written by: @artemborin. Learn more about this writer by checking @artemborin's about page, and for more stories, please visit hackernoon.com. Although Recurrent Neural Networks (RNNs) were designed to mirror certain aspects of human cognition, they've been surpassed by Transformers in Natural Language Processing tasks. The primary reasons include RNNs' vanishing gradient problem, difficulty in capturing long-range dependencies, and training inefficiencies. The hypothesis that larger RNNs could mitigate these issues falls short in practice due to computational inefficiencies and memory constraints. Transformers, on the other hand, leverage their parallel processing ability and self-attention mechanism to efficiently handle sequences and train larger models. Thus, the evolution of AI architectures is driven not only by biological plausibility but also by practical considerations such as computational efficiency and scalability.
This story was originally published on HackerNoon at: https://hackernoon.com/how-to-enable-auto-start-for-apache-dolphinscheduler. To set DolphinScheduler to start automatically upon system boot, you typically need to configure it as a system service. Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #bigdata, #data-science, #workflow-automation, #linux, #how-to-enable-auto-start, #apache-dolphinscheduler, #apache-dolphinscheduler-guide, and more. This story was written by: @williamguo. Learn more about this writer by checking @williamguo's about page, and for more stories, please visit hackernoon.com. To set DolphinScheduler to start automatically upon system boot, you typically need to configure it as a system service. The following are general steps, which may vary depending on your operating system.