Time | Talk | Speaker
---|---|---
09:45am - 10:00am | Opening Remarks | Mark Phillips
10:00am - 10:45am | Keystone: Between a ROC and a SOFT Place | Pat Helland |
11:00am - 11:45am | LVars: Lattice-based Data Structures for Deterministic Parallelism | Lindsey Kuper |
12:00pm - 12:45pm | Riak Search 2.0 | Eric Redmond |
Lunch | |
2:00pm - 2:45pm | The Seven-Layer Burrito: Troubleshooting a Distributed Database in Production | Justin Shoffstall and Charlie Voiselle
3:00pm - 3:45pm | Bad As I Wanna Be: Coordination and Consistency in Distributed Databases | Peter Bailis |
4:00pm - 4:45pm | Bringing Consistency to Riak (Part 2) | Joseph Blomstedt |
5:00pm - 6:30pm | Lightning Talks |
Time | Talk | Speaker
---|---|---
09:45am - 10:00am | Opening Remarks | Mark Phillips
10:00am - 10:45am | Keystone: Between a ROC and a SOFT Place | Pat Helland |
11:00am - 11:45am | More Than Just Data: Using Riak Core to Manage Distributed Services | Miles O'Connell |
12:00pm - 12:45pm | Device Based Innovation to Enable Scale Out Storage | James Hughes |
Lunch | |
2:00pm - 2:45pm | Practicalities of Productionizing Distributed Systems | Jeff Hodges |
3:00pm - 3:45pm | Yuki: Functional Data Structures for Riak | Ryland Degnan |
4:00pm - 4:45pm | Timely Dataflow in Naiad | Derek Murray |
5:00pm - 6:30pm | Lightning Talks |
Time | Talk | Speaker
---|---|---
9:30am - 9:45am | Opening Remarks | Mark Phillips
9:45am - 10:30am | Maximum Viable Product | Justin Sheehy |
10:45am - 11:30am | The Raft Consensus Algorithm | Diego Ongaro |
11:45am - 12:30pm | Controlled Epidemics: Riak's New Gossip Protocol and Metadata Store | Jordan West |
Lunch | |
1:30pm - 2:15pm | Dynamic Dynamos: Comparing Riak and Cassandra | Jason Brown |
2:30pm - 3:15pm | Distributed Systems Archaeology | Michael Bernstein |
3:30pm - 4:15pm | Riak Security: Locking the Distributed Chicken Coop | Andrew Thompson
4:30pm - 5:15pm | The Tail at Scale: Achieving Rapid Response Times in Large Online Services | Jeff Dean |
Time | Talk | Speaker
---|---|---
9:30am - 9:45am | Opening Remarks | Mark Phillips
9:45am - 10:15am | Maximum Viable Product | Justin Sheehy |
10:30am - 11:15am | CRDTs in Production | Jeremy Ong |
11:30am - 12:15pm | Distributing Work Across Clusters: Adventures With Riak Pipe | Susan Potter |
Lunch | |
1:30pm - 2:15pm | Denormalize This! | Richard Simon and Richard Berglund |
2:30pm - 3:15pm | CRDTs: An Update (or maybe just a PUT) | Sam Elliott |
3:30pm - 4:15pm | Building Next Generation Weather Data Distribution and On-demand Forecast Systems Using Riak | Raja Selvaraj and Arvinda Gillella |
4:30pm - 5:15pm | The Tail at Scale: Achieving Rapid Response Times in Large Online Services | Jeff Dean |
Join the discussion in #riconwest on IRC (freenode).
RICON wouldn't be possible without the support and inspiration of our various sponsors. Take a moment to peruse what they have to offer.
Today’s large-scale web services provide rapid responses to interactive requests by applying large amounts of computational resources to massive datasets. They typically operate in warehouse-sized datacenters and run on clusters of machines that are shared across many kinds of interactive and batch jobs. As these systems distribute work to ever larger numbers of machines and sub-systems in order to provide interactive response times, it becomes increasingly difficult to tightly control latency variability across these machines, and the 95th- and 99th-percentile response times often suffer in an effort to improve average response times. As systems scale up, simply stamping out all sources of variability does not work. Just as fault-tolerant techniques had to be developed when guaranteeing fault-free operation by design became infeasible, techniques that deliver predictably low service-level latency in the presence of highly variable individual components are increasingly important at larger scales.
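To see why variability compounds at scale, consider a small illustrative calculation (the numbers are assumptions for illustration, not figures from the talk): if a single server blows its latency budget on just 1% of requests, a request that fans out to 100 such servers and must wait for all of them will blow the budget most of the time:

```latex
P(\text{request is slow}) = 1 - 0.99^{100} \approx 0.63
```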
In this talk, I’ll describe a collection of techniques and practices for lowering response times in large distributed systems whose components run on shared clusters of machines, where pieces of these systems are subject to interference by other tasks, and where unpredictable latency hiccups are the norm, not the exception. Some of the techniques adapt to trends observed over periods of a few minutes, making them effective at dealing with longer-lived interference or resource contention. Others react to latency anomalies within a few milliseconds, making them suitable for mitigating variability within the context of a single interactive request. I’ll discuss examples of how these techniques are used in various pieces of Google’s systems infrastructure and in various higher-level online services.
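One of the best-known techniques in this family is the hedged request: issue the request to one replica, and if no answer arrives within a small delay, issue a backup copy to a second replica and take whichever responds first. Below is a minimal sketch of the pattern in Python; the `fetch` function, replica names, and timings are invented for illustration, and this is not code from Google's infrastructure.

```python
import concurrent.futures
import random
import time

def fetch(replica, key):
    """Stand-in for a replica RPC: usually fast, occasionally very slow."""
    time.sleep(random.choice([0.01] * 95 + [0.5] * 5))
    return f"value of {key} from {replica}"

def hedged_fetch(pool, replicas, key, hedge_delay=0.02):
    """Ask one replica; if it has not answered within hedge_delay seconds,
    send a backup request to a second replica and take whichever wins."""
    futures = [pool.submit(fetch, replicas[0], key)]
    done, _ = concurrent.futures.wait(futures, timeout=hedge_delay)
    if not done:  # primary looks slow, so hedge to another replica
        futures.append(pool.submit(fetch, replicas[1], key))
        done, _ = concurrent.futures.wait(
            futures, return_when=concurrent.futures.FIRST_COMPLETED)
    return next(iter(done)).result()

pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)
print(hedged_fetch(pool, ["replica-a", "replica-b"], "user:42"))
pool.shutdown(wait=False)
```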
Jeff joined Google in 1999 and is currently a Google Fellow in Google's Systems Infrastructure Group. He co-developed the MapReduce computational framework, and is a co-designer and co-implementor of heavily-used distributed storage systems, including BigTable and Spanner. He co-designed and implemented five generations of Google's crawling, indexing, and query serving systems, as well as major pieces of Google's initial advertising and AdSense for Content systems. He has also worked on large-scale machine learning and machine translation software, and has also designed and implemented many of Google's low-level software libraries and developer tools.
Discussing the latest in consensus protocols and distributed databases is extremely sexy, but how do we get those things into production? This talk discusses tactics and strategy for productionizing distributed systems with a little bit about what the future will hold.
Jeff Hodges is a distributed systems engineer at Twitter, Inc. He currently writes code for the web server fronting all of Twitter's web traffic; he has previously fought spam (and other darker aspects of social media), worked on basic storage and infrastructure, and sometimes enjoys being on-call. He recently wrote Notes on Distributed Systems for Young Bloods to give friends and coworkers a leg-up on stuff he wished he had known years ago.
Consensus plays a key role in fault-tolerant distributed systems. The theoretical community has studied it extensively, focusing on the Paxos algorithm in the last decade. Unfortunately, Paxos is difficult to understand, and it must be modified and extended significantly to arrive at a practical system. In response, Professor John Ousterhout and I have developed Raft, a consensus algorithm designed for understandability. It's equivalent to Paxos in fault-tolerance and performance, but it's designed to be as easy to understand as possible, while cleanly addressing all major pieces needed for practical systems. We hope Raft will make consensus available to a wider audience, and that this wider audience will be able to develop a wider variety of higher quality consensus-based systems than are available today.
Diego is a PhD student in Computer Science at Stanford University, where he researches distributed systems with Professor John Ousterhout. His thesis topic involves bridging the theory and practice of consensus algorithms, a fundamental building block for fault-tolerant systems. Along with Professor Ousterhout, he recently co-designed the Raft consensus algorithm with the explicit aim of being easier to understand than existing algorithms. He developed an interest in consensus algorithms and configuration services while working on RAMCloud, a consistent key-value store with five microsecond round-trip access times. RAMCloud makes use of Raft for handling failures in its coordinator server.
Talk details coming soon...
Justin Sheehy is the CTO of Basho Technologies, the company behind the creation of Webmachine and Riak. As Chief Technology Officer, Justin directs Basho's technical strategy, including integration with other platforms and new research into storage and distributed systems. Before joining Basho, he was most recently a principal scientist at the MITRE Corporation and a senior architect for systems infrastructure at Akamai. At both of those companies he focused on multiple aspects of robust distributed systems, including scheduling algorithms, language-based formal models, and resilience.
Keystone is a proposed design that could unify and simplify many forms of storage. We plan to do this by building the storage to expect failure and recover quickly.
ROC (Recovery Oriented Computing) is a collection of research projects led by UC Berkeley from about 2001 to 2006. One of its major premises is that systems can be made more robust by making their components fail quickly if they are in trouble. By focusing on MTTR (Mean-Time-To-Repair) rather than MTBF (Mean-Time-Between-Failures), the entire system can offer better service even while things break. This is the approach taken by the very large web-sites we see today.
SOFT (Storage Over Flaky Technology) is an acronym I made up to fit the talk title. It represents the trend toward ever simpler and less expensive components. For example, consumer-grade SSD costs about 1/8th as much as enterprise-grade SSD (and is denser in the server). Still, consumer-grade SSD will return an uncorrectable error about 100,000 times as often as enterprise-grade. By designing the architecture of Keystone to store immutable data and surrounding that data with aggressive error checking in software, we can be extremely confident of detecting an error and fetching the desired data from one of the other places it has been stored.
In this talk, we will outline the types of storage used by Salesforce (in addition to our use of Oracle and SANs for the relational database). We will talk about the architectural principles used to provide performance, correctness, and availability even while the components themselves fail. Finally, we will walk through some math about the availability of data, which provides a fun perspective on using SOFT.
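As a rough sketch of the flavor of that math (with numbers and an independence assumption of my own, not figures from the talk): the availability of a single copy follows from MTBF and MTTR, and keeping several independent copies drives unavailability down geometrically.

```latex
A = \frac{\mathrm{MTBF}}{\mathrm{MTBF} + \mathrm{MTTR}}, \qquad
P(\text{all } k \text{ copies unavailable}) \approx (1 - A)^{k}
```

With a per-copy availability of 0.99 and three copies, for example, the chance that all copies are unavailable at once is roughly one in a million.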
Pat Helland has been working in distributed systems, databases, transaction processing, scalable systems, and fault tolerance since 1978.
For most of the 1980s, Pat worked at Tandem Computers as the Chief Architect for TMF (Transaction Monitoring Facility), the transaction and recovery engine under NonStop SQL. After 3+ years designing a Cache Coherent Non-Uniform Memory Multiprocessor for HaL Computers (a subsidiary of Fujitsu), Pat moved to the Seattle area to work at Microsoft in 1994. There he was the architect for Microsoft Transaction Server, Distributed Transaction Coordinator, and a high performance messaging system called SQL Service Broker which ships with SQL Server. From 2005-2007, he was at Amazon working on the product catalog and other distributed systems projects including contributing to the original design for Dynamo. After returning to Microsoft in 2007, Pat worked on a number of projects including Cosmos, the scalable "Big Data" plumbing behind Bing. While working on Cosmos, Pat architected both a project to integrate database techniques into the massively parallel computations as well as a very high-throughput event processing engine.
Since early 2012, Pat has worked at Salesforce.com in San Francisco. He is focusing on multi-tenanted database systems and scalable reliable infrastructure for storage.
The past forty years of distributed database design have identified sufficient conditions for guaranteeing application-level correctness criteria, but the coordination mechanisms for enforcing them are frequently expensive. This has led to a proliferation of consistency models and more scalable system designs that may in turn sacrifice correctness. What's actually necessary for applications, and what's the cost?
In this talk, I'll discuss how to reason about the trade-offs between coordination, consistency, latency, and availability, with a focus on practical takeaways from recent research both at Berkeley and beyond. I'll also talk about reconciling "consistency" in NoSQL and ACID databases and explain why, even though you probably didn't beat the CAP Theorem, you (and tomorrow's database designs) may be on to something.
Peter Bailis is a graduate student in Computer Science at the University of California, Berkeley, where he researches distributed systems and databases with Joe Hellerstein, Ion Stoica, and Ali Ghodsi. His recent interests include distributed data consistency at the intersection of theory and practice and strong semantics in highly available systems.
Parallel programming is notoriously difficult. A fundamental reason for this difficulty is that programs can yield inconsistent answers, or even crash, due to unpredictable interactions between parallel tasks. But it doesn't have to be this way: deterministic-by-construction programming models offer the promise of freedom from subtle, hard-to-reproduce nondeterministic bugs in parallel code.
A common theme that emerges in the study of deterministic-by-construction systems, from venerable models like Kahn process networks to modern ones like the Intel Concurrent Collections system and Haskell's monad-par library, is that the determinism of the system hinges on some notion of monotonicity. In fact, it's no coincidence that the same principle of monotonicity that CRDTs leverage to ensure eventual consistency in distributed systems can also be put to work in deterministic-by-construction parallel programming.
In this example-driven talk, I'll introduce LVars, which are data structures that enable deterministic parallel programming. LVars generalize the single-assignment variables often found in deterministic parallel languages to allow multiple assignments that are monotonically increasing with respect to a user-specified lattice of states. LVars maintain determinism by allowing only monotonic writes and "threshold" reads to and from shared data. We'll look at examples of programming in an LVar-based parallel language that is provably deterministic, and we'll explore the connection between LVars and CRDTs.
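To make the write/read discipline concrete, here is a toy single-process sketch (my own illustration, not the speaker's library) of an LVar-like cell over the lattice of sets ordered by inclusion: writes may only add elements (a monotonic join), and a threshold read blocks until the state has grown past a requested threshold.

```python
import threading

class SetLVar:
    """A toy LVar over the lattice of sets ordered by subset inclusion."""

    def __init__(self):
        self._state = set()
        self._cond = threading.Condition()

    def put(self, element):
        """Monotonic write: join the current state with {element}.
        Elements can only ever be added, never removed or overwritten."""
        with self._cond:
            self._state.add(element)
            self._cond.notify_all()

    def get(self, threshold_element):
        """Threshold read: block until the state includes threshold_element.
        The answer is the same under every interleaving of writes, which is
        what keeps the program deterministic."""
        with self._cond:
            self._cond.wait_for(lambda: threshold_element in self._state)
            return threshold_element

lv = SetLVar()
writer = threading.Thread(target=lambda: lv.put("done"))
writer.start()
print(lv.get("done"))  # always prints "done", regardless of scheduling
writer.join()
```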
Lindsey Kuper is a Ph.D. candidate in the Programming Languages Group at Indiana University, where she studies the foundations of deterministic parallel programming. She's a recidivist Mozilla Research intern and contributor to the Rust programming language, and a summer 2013 Hacker School resident. She blogs at composition.al.
While the rapid adoption of distributed systems methodologies to cope with modern business needs has led to a crystallization of "best practices" that could be described as ad hoc at best, academics and practitioners, eager to find elegant solutions to difficult problems, are collaborating in new, exciting ways. What gets lost in the shuffle of commerce and progress, however, are the lessons of the past. What are these lessons? What are we doomed to repeat, and what new ground are we covering? This talk looks at the origins of Distributed Systems in Computer Science academia and practice with the hope of pointing out promising areas of research for today's developers and theorists.
Michael Bernstein is a Brooklyn-based software developer, writer, and food and drink fanatic. He is currently a Software Developer at Paperless Post, a New York City-based startup, and blogs at michaelrbernste.in.
Existing dataflow systems, such as MapReduce and Dryad, have revolutionized large-scale data processing, but their batch model of computation introduces long delays between the ingestion of new data and its appearance in results. Recently there has been rapid growth in data sources, such as social networks, where the data are more valuable if they can be processed sooner. In this talk, I will present the design and implementation of Naiad, which supports the timely processing of such data, combining the low latency of a stream processor with the rich programmability of a dataflow system. Naiad can perform complex graph analyses on the Twitter firehose in real time with sub-second delays. I will demonstrate how it can also be used to perform interactive data exploration and visualization on large-scale data sets.
Naiad is joint work with Frank McSherry, Rebecca Isaacs, Michael Isard, Martín Abadi, and Paul Barham.
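For intuition about what "timely" means here, the following toy single-process sketch (my simplification of the idea, not Naiad's actual API) shows the two callbacks at the heart of the model: operators receive messages stamped with a logical epoch, and they receive a notification once an epoch can no longer produce new messages, at which point results for that epoch can be emitted immediately.

```python
from collections import defaultdict

class CountPerEpoch:
    """Toy operator: accumulates word counts per epoch and emits a result
    as soon as the epoch is known to be complete."""

    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def on_receive(self, epoch, word):
        # Buffer per-epoch state; nothing is emitted yet.
        self.counts[epoch][word] += 1

    def on_notify(self, epoch):
        # Called when no more messages can arrive for this epoch.
        print(f"epoch {epoch}: {dict(self.counts.pop(epoch))}")

op = CountPerEpoch()
for epoch, word in [(0, "riak"), (0, "naiad"), (1, "riak"), (0, "riak")]:
    op.on_receive(epoch, word)
op.on_notify(0)  # epoch 0 complete -> {'riak': 2, 'naiad': 1}
op.on_notify(1)  # epoch 1 complete -> {'riak': 1}
```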
Derek Murray is a Researcher at Microsoft Research Silicon Valley. Before escaping to California, he completed a PhD at the University of Cambridge on the expressive power of distributed execution engines. His aim in life is to make it easier for non-experts to write programs that scale efficiently, and he'll happily work at any level of the stack to achieve this.
Riak is extremely fast as a key-value store, but querying on secondary indexes or running MapReduce jobs can result in unpredictable latency. In practice, developers often require richer means of querying data in real-time. Yuki is an OCaml library that implements various functional data structures in Riak, giving users the ability to interact with their data as if it were a queue, a heap, a random access list or a custom data structure. Yuki has been used in practice to achieve extremely low-latency random access, flexible paging, and conditional streaming of Riak data.
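For a flavor of the kind of structure involved, here is an illustrative sketch in Python (Yuki itself is an OCaml library, and this is not its code) of the classic purely functional two-list queue: every operation returns a new version and never mutates an old one, which maps naturally onto storing immutable values in a key-value store.

```python
class FunctionalQueue:
    """Okasaki-style two-list queue: every operation returns a new queue,
    leaving all previous versions untouched (persistent and immutable)."""

    def __init__(self, front=(), back=()):
        self.front = tuple(front)  # elements ready to be dequeued
        self.back = tuple(back)    # recently enqueued elements, newest first

    def enqueue(self, item):
        return FunctionalQueue(self.front, (item,) + self.back)

    def dequeue(self):
        front, back = self.front, self.back
        if not front:  # shift the back list over when the front runs dry
            front, back = tuple(reversed(back)), ()
        if not front:
            raise IndexError("dequeue from empty queue")
        return front[0], FunctionalQueue(front[1:], back)

q0 = FunctionalQueue()
q1 = q0.enqueue("a").enqueue("b")
item, q2 = q1.dequeue()
print(item)           # "a"
item_again, _ = q1.dequeue()
print(item_again)     # still "a": q1 was never mutated
```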
Ryland Degnan is a Senior Software Engineer at Netflix. Previously he worked at Skydeck, a Silicon Valley startup using OCaml as the backend language for a mobile photo-sharing application, and Codian, a UK videoconferencing startup based on massively parallel processor architectures.
StackMob offers customers the ability to create their own API endpoints by uploading code, which StackMob runs. Managing and running arbitrary code for multiple customers alongside existing services can be difficult, and scaling it can lead to serious headaches. We've worked to solve both issues using Riak Core as a routing and management layer.
Riak Core has generally been used for data storage applications, taking after Riak, Riak KV, and Riak CS. Instead of using Riak Core to manage data, we use it to manage separate computing instances. We manage capacity for our customers using Riak Core's built-in abstractions. This opens up new possibilities for using Riak Core to build distributed, service-oriented platforms.
Riak Core provides a sophisticated platform for managing separate instances in a multitenant environment, and is well suited to managing a distributed, service-oriented system. I'll be talking about our particular system and how it takes advantage of the Dynamo model, and more generally about using Riak Core to manage applications rather than data.
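The routing idea underneath is Dynamo-style consistent hashing: the key space is a ring divided into partitions, each owned by a virtual node, so any node can deterministically locate the owner of a request. Here is a minimal sketch of that lookup (my own illustration of the general scheme, not StackMob's or Riak Core's code):

```python
import hashlib

RING_SIZE = 64  # number of partitions (vnodes) on the ring

def partition_for(key):
    """Hash a key onto the ring and return the partition that owns it."""
    digest = hashlib.sha1(key.encode()).hexdigest()
    return int(digest, 16) % RING_SIZE

def build_ring(nodes):
    """Hand partitions out to physical nodes round-robin, as a claim table."""
    return {p: nodes[p % len(nodes)] for p in range(RING_SIZE)}

ring = build_ring(["node-a", "node-b", "node-c"])
key = "customer:1234/custom-endpoint"
partition = partition_for(key)
print(f"{key} -> partition {partition} owned by {ring[partition]}")
```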
Miles is a software engineer at StackMob, where he works on the backend team building distributed systems in Scala and Erlang. Prior to this he studied Computer Science at Claremont McKenna College.
You've done it: you've convinced your manager that the one thing that stands between his/her life now and a life in lucrative early retirement is a distributed database. You've provisioned the hardware, leased the colo space, and signed the bandwidth contracts. You've installed and configured the software, trained the staff, migrated the codebase, and pushed the Launch button.
But something is wrong. What's broken? How do you fix it?
We'll take you on a walk through how Basho's Client Services team addresses these sorts of issues and helps to set your team back on the road to profitability.
Justin Shoffstall started with Basho in November 2011, following a stint developing custom software in Java, JavaScript, and Python. Before that, he worked as a Systems and Network Administrator for various ISPs, banks, and manufacturing facilities. At night he solves crimes, defends the downtrodden, and plays bass guitar.
Amazon's Dynamo paper has turned out to be one of the main catalysts for the NoSQL movement, with two of its most famous implementations being Riak and Apache Cassandra. In this talk, a committer on the Apache Cassandra project branches out from Java-land to explore the Riak view of the Dynamo landscape, and will compare and contrast the two systems. Further, as I work for one of the largest Apache Cassandra users on the internet, I'll hypothesize how Riak could be used at Netflix, both from a data modeling standpoint and when operating in a cloud-based environment.
Jason is a Senior Software Engineer at Netflix, where he's been for the last five years. He is also a committer on the Apache Cassandra project, and is interested in all types of distributed systems and databases. Jason holds a Master’s degree in Music Composition and is searching for time to write a second opera.
Until now, Riak has not known or cared about what values users have been storing inside it. While this allows the store to be used for almost anything, it also leaves our users with the task of designing their own data structures, serialization and, in lots of cases, sibling merge functions.
For the last year a small engineering group inside Basho has been working on a set of built-in Data Types, in a bid to make Riak easier to use.
This talk will cover the group’s progress on CRDTs since last year, including our current work, and will show developers how to use them in their own applications.
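For readers new to the term, the canonical example of a CRDT is the grow-only counter: each replica increments only its own slot, and merge takes the element-wise maximum, so merges are commutative, associative, and idempotent. A minimal sketch (illustrative only; Riak's built-in Data Types are implemented inside Riak itself, not like this):

```python
class GCounter:
    """Grow-only counter CRDT: one non-decreasing slot per replica."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.slots = {}  # replica id -> count

    def increment(self, amount=1):
        self.slots[self.replica_id] = self.slots.get(self.replica_id, 0) + amount

    def value(self):
        return sum(self.slots.values())

    def merge(self, other):
        """Element-wise max: safe to apply in any order, any number of times."""
        for rid, count in other.slots.items():
            self.slots[rid] = max(self.slots.get(rid, 0), count)

a, b = GCounter("a"), GCounter("b")
a.increment(3)
b.increment(2)
a.merge(b)
b.merge(a)
print(a.value(), b.value())  # both replicas converge to 5
```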
Sam is a Computer Science student at the University of St Andrews and Basho’s first Engineering Intern. He’s interested in Distributed Systems, Concurrency and Parallelism.
You've done it: you've convinced your manager that the one thing that stands between his/her life now and a life in lucrative early retirement is a distributed database. You've provisioned the hardware, leased the colo space, and signed the bandwidth contracts. You've installed and configured the software, trained the staff, migrated the codebase, and pushed the Launch button.
But something is wrong. What's broken? How do you fix it?
We'll take you on a walk through how Basho's Client Services team addresses these sorts of issues and helps to set your team back on the road to profitability.
Charlie Voiselle is a Client Services Engineer with Basho Technologies. When he's not providing bad-ass rockstar support to his users, he's probably bowling. He's easy to spot in a crowd due to his hair, and you'll probably never see it the same color twice. Although his handle is @angrycub, he's really a big ol' teddy bear that loves to talk about supporting distributed systems.
Talk Details TBA
After dropping out of University College Dublin in the early 2000s, Andrew did some freelance web design until he somehow landed a job in an area he knew nothing about: VoIP telephony and automated call distribution for call centers. He went on to develop two call center platforms, the second one, open-sourced and called OpenACD, in Erlang (which he learned on the fly). Along the way he wrote an Erlang interface module for the FreeSWITCH VoIP server and the popular gen_smtp library for Erlang. Following that, he went to work at Barracuda Networks for a year doing JavaScript UI development for their Cudatel VoIP product. Following a 2011 writeup of optimizing GitHub's Erlang git daemon (which they abandoned because it didn't scale), he was offered a job at Basho, which he has enjoyed ever since.
At Basho, Andrew has been responsible for maintaining the EDS Replication product as well as designing and implementing Lager, the popular Erlang logging library used in Riak since 1.0. Of late he's been spending a lot of time working on Riak's forthcoming security framework. In his rare free time, he enjoys working on his cars, tinkering with old UNIX hardware and hacking on pointless side projects like re-implementing ancient DOS video games and trying to implement an IMAP server in Erlang.
Riak excels at one type of query: puts and gets. But the world demands more from a database. Since Basho isn't primarily a search company, we decided to leverage the power of Solr for Riak 2.0. This is a walkthrough of what new features we get, how it's an improvement, and why you'd want to use it. Also, of course, demos.
Eric is an engineer at Basho who also happens to work on Yokozuna. He has founded a company or two and written a few books, but most importantly of all, he is a Google Glass Explorer.
Running background units of work (or jobs) that are triggered by events or run on pre-determined schedules, such as generating reports from queries, sending invoices, or calculating risk metrics, is a requirement of many software applications.
Riak Pipe is a layer that sits on top of Riak Core and can help application developers build Riak-style decentralized systems that need to distribute units of work around a cluster. Riak Pipe can provide backpressure to upstream clients, facilitate parallel split-join processing across multiple nodes, take advantage of data locality by design, remove common single points of failure, and more.
In this talk Susan will explain some of the common problems of building background job systems, some existing failover solutions and their pain points, and how a toolkit like Riak Pipe can offer application developers a foundation for building a more dynamic solution that is operationally scalable, distributes work more evenly, and behaves more predictably at peak loads.
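The backpressure idea is easy to see with bounded queues: when a downstream stage falls behind, its input queue fills up and the upstream producer blocks instead of overrunning it. A tiny single-machine sketch of that behavior follows (a conceptual illustration only; Riak Pipe distributes these stages across an Erlang cluster rather than across threads):

```python
import queue
import threading
import time

jobs = queue.Queue(maxsize=2)  # small bound so backpressure kicks in quickly

def producer():
    for i in range(6):
        jobs.put(f"report-{i}")  # blocks while the queue is full: backpressure
        print(f"enqueued report-{i}")
    jobs.put(None)               # sentinel: no more work

def worker():
    while True:
        job = jobs.get()
        if job is None:
            break
        time.sleep(0.05)         # pretend this is an expensive background job
        print(f"processed {job}")

threads = [threading.Thread(target=producer), threading.Thread(target=worker)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```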
Susan is currently a Lead Software Engineer who has worked in systems and infrastructure at large-scale SaaS companies such as Salesforce.com. She has worked in application development at startups in finance and media/publishing, and before that developed software for front- and middle-office systems at trading firms. When not on her laptop, Susan can be found cycling around Champaign-Urbana or attempting to reason with her toddler, the latter usually unsuccessfully.
According to the CAP theorem, a database that guarantees consistency must sometimes fail valid requests. Likewise, a database that guarantees availability must sometimes sacrifice consistency. Thus, databases of today are opinionated creatures. They pick a side. And thus, so must you, the user. You either pick a side, or you use multiple databases and live with the increased operational complexity.
This talk presents an alternative: ongoing work to extend Riak to embrace CAP in its entirety, providing both AP and CP semantics and allowing you, the user, to choose on a per-bucket basis whether you favor absolute consistency or absolute availability. Both options make sense in different scenarios, so why pick a side?
This talk is an update to last year's RICON West talk about adding strongly consistent operations (single key atomic updates) to Riak. While last year's talk discussed the challenges, motivations, and high level plans of bringing consistency to Riak, this talk will present the actual implementation that has since been built. If you're a fan of consensus algorithms, a CAP aficionado, or someone who simply needs a little consistency in your life, then this talk is for you. Finally, this talk will hopefully shed some light on exactly how far off this work is from landing in Riak.
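From the client's point of view, strongly consistent single-key atomic updates boil down to semantics like a conditional put: a write succeeds only if the object has not changed since it was read, and every client observes a single agreed order of updates. The toy in-memory sketch below illustrates that client-visible behavior only; it is not how Riak's consensus-based implementation works internally.

```python
import threading

class ConsistentKV:
    """Single-copy semantics: each key carries a version, and a write must
    name the version it read (a compare-and-swap)."""

    def __init__(self):
        self._data = {}  # key -> (version, value)
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return self._data.get(key, (0, None))

    def put(self, key, expected_version, value):
        with self._lock:
            current_version, _ = self._data.get(key, (0, None))
            if current_version != expected_version:
                return False  # lost the race; caller must re-read and retry
            self._data[key] = (current_version + 1, value)
            return True

kv = ConsistentKV()
version, _ = kv.get("account:1")
print(kv.put("account:1", version, 100))  # True: first writer wins
print(kv.put("account:1", version, 200))  # False: stale version is rejected
```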
Joseph Blomstedt is a senior engineer at Basho where he has spent the last 2.5 years working on Riak. Joe's contributions include the new clustering subsystem in Riak 1.0, the addition of active anti-entropy in Riak 1.3, and on-going work on adding strong consistency to Riak. Joe is also a PhD candidate at the University of Colorado, researching dataflow programming and heterogeneous CPU/GPU systems. Joe works from home in Seattle, where he spends his free time enjoying all the Pacific Northwest has to offer: great outdoors, great beer, and great coffee.
Transactions, locking, and write serialization are the conventional means for establishing consistency guarantees in distributed systems. As usage of eventual consistency and read repair has escalated, due in part to new database solutions becoming more readily available, CRDTs (commutative replicated data types) have emerged as an elegant solution for enforcing eventual consistency optimistically. Real-world data, however, rarely resembles a canonical CRDT, or even an ensemble of them. In addition, CRDTs can be unintuitive to developers used to thinking in the old model. This talk aims to give practical, real-world solutions for leveraging CRDT concepts in an industrial application via case study. In particular, this talk will give suggestions on how to tackle data operations that can't always commute. Finally, this talk suggests possible future directions for how more resolution can occur AOT at the database layer, beyond typical CRDT merge operations.
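One common compromise when an operation genuinely cannot be made commutative is to fall back to a deterministic tie-break, for example a last-writer-wins register keyed by timestamp: replicas still converge, at the price of discarding one of the concurrent writes. A small illustrative sketch (my example, not the speaker's system):

```python
class LWWRegister:
    """Last-writer-wins register: merge keeps whichever write carries the
    larger (timestamp, writer) tag, so all replicas converge deterministically."""

    def __init__(self):
        self.tag = (0, "")  # (timestamp, writer id) of the winning write
        self.value = None

    def set(self, value, timestamp, writer):
        self._absorb((timestamp, writer), value)

    def merge(self, other):
        self._absorb(other.tag, other.value)

    def _absorb(self, tag, value):
        if tag > self.tag:
            self.tag, self.value = tag, value

a, b = LWWRegister(), LWWRegister()
a.set("written on replica a", timestamp=10, writer="a")
b.set("written on replica b", timestamp=12, writer="b")
a.merge(b)
b.merge(a)
print(a.value == b.value)  # True: converged, but the earlier write was lost
```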
Jeremy, at the time of writing this bio, is in the process of forging a startup with some great individuals. His past experience includes architecting and shipping large scale distributed systems in the video game industry. Jeremy is a self-taught programmer, coming from a background in math and physics and is always interested in finding ways to apply math and physics in the context of computer science. He is passionate about distributed systems, physically based rendering, and game engine architecture.
The Weather Company boasts the best historical weather data in the business: terabytes of it. The data, combined with the requirements of manageability, scalability and usability, posed a problem when TWC wanted to effectively store and distribute it. The first part of the talk will focus on that problem, its challenges, and Riak as its solution. The second part of the talk will focus on building the next generation on-demand forecasting system using Riak.
Raja Selvaraj is the manager of the data systems engineering team at The Weather Company. He currently leads the data systems team (Oracle, SQL Server, MySQL, Riak, Redis, Hadoop, MongoDB) and the systems engineering team, which is responsible for building the next generation weather data distribution platform. He has bachelor's and master's degrees in computer science.
The Weather Company boasts the best historical weather data in the business: terabytes of it. The data, combined with the requirements of manageability, scalability and usability, posed a problem when TWC wanted to effectively store and distribute it. The first part of the talk will focus on that problem, its challenges, and Riak as its solution. The second part of the talk will focus on building the next generation on-demand forecasting system using Riak.
Arvinda Gillella is the architect of the SUN platform at The Weather Company. He currently leads the team responsible for building next generation RESTful data services on a big data platform. Prior to joining The Weather Company, he worked at Oracle and Sun Microsystems developing large-scale ERP applications.
Users are familiar with how Riak stores their data, but what about its own internal information like node capabilities and ownership details? That data is stored in one of Riak's internal distributed data stores, of course!
This talk will cover one of these stores: Cluster Metadata. Cluster Metadata is an internal, fully-replicated, eventually consistent DHT that will be included in the next release of Riak. The talk will cover, briefly, how this store is used, why it was built, and what improvements it will bring. The primary focus will be covering its design and implementation through some formal, and not so formal, models of its replication and anti-entropy protocols.
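The anti-entropy part of such a store can be pictured as periodic pairwise repair: two replicas compare compact digests of what they hold and ship each other only the entries the other is missing or holds an older version of. The sketch below is a drastically simplified illustration of that exchange; the protocol covered in the talk uses hash trees and logical clocks rather than plain version numbers.

```python
def digest(store):
    """Compact summary of a replica: key -> version."""
    return {key: version for key, (version, _) in store.items()}

def anti_entropy(local, remote):
    """Pairwise repair: push newer entries in each direction until equal."""
    remote_versions = digest(remote)
    for key, (version, value) in local.items():
        if remote_versions.get(key, 0) < version:
            remote[key] = (version, value)  # remote is missing or stale
    local_versions = digest(local)
    for key, (version, value) in list(remote.items()):
        if local_versions.get(key, 0) < version:
            local[key] = (version, value)   # local is missing or stale

# Each replica maps key -> (version, value); versions grow with every update.
node1 = {"ring/ownership": (3, "claim-v3"), "capability/foo": (1, "on")}
node2 = {"ring/ownership": (2, "claim-v2"), "capability/bar": (1, "off")}
anti_entropy(node1, node2)
print(node1 == node2)  # True: both replicas now hold the same metadata
```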
Jordan is an engineer at Basho, where he has spent the past year working on the core components of Riak. Prior to working at Basho, he was a Lead Architect at StackMob, built social games for News Corp., and worked on content and supply-chain management systems for companies such as True Value Hardware, International Truck & Engine Corp., and TOMS Shoes. When not heads down in an emacs buffer, you will usually find him out with his two pit bulls (whose pictures tend to dominate the stream of @_jrwest).
One of the common solutions to poor performance on RDBMSs is to denormalize complex table relationships to make the SQL simpler via reduced table joins. State Farm's recent cloud Platform as a Service initiative brought Riak into our toolbox for data solutions. We looked at some of our poorly performing RDBMS solutions and realized that if you can denormalize, you can probably just persist the data in an easily consumable format in Riak. We'll discuss some of the solutions we are looking to move to Riak, the methodology used to structure the data, and possible code patterns/benefits.
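As a concrete picture of the approach (with hypothetical names, not State Farm's actual schema): data that would otherwise require a multi-table join is pre-assembled into a single document, keyed by how it will be read, so one get returns everything the application needs.

```python
import json

# Instead of joining POLICY, CUSTOMER, and VEHICLE tables at read time,
# store one denormalized document per policy, keyed by policy number.
policy_doc = {
    "policy_id": "POL-48213",
    "holder": {"name": "Jane Example", "state": "IL"},
    "vehicles": [
        {"vin": "EXAMPLEVIN0000001", "year": 2011, "model": "Sedan"},
    ],
    "coverage": {"liability": 100000, "deductible": 500},
}

bucket = {}  # stand-in for a Riak bucket: key -> serialized value
bucket["policies/POL-48213"] = json.dumps(policy_doc)

# A single key-value get replaces the multi-table join.
print(json.loads(bucket["policies/POL-48213"])["holder"]["name"])
```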
Richard’s career has focused, for the last 8 years of an illustrious 8-year career, on accessing data. He has provided services for online and batch applications as well as troubleshooting other developers' data problems. He is now a lead in the Data Architecture space for the core business products that State Farm provides. His three children take up most of his spare time, but on occasion he enjoys tinkering on personal application projects with friends and playing video games.
Richard’s career started out working on data-intensive applications in the rating space for State Farm's Fire Company. He then joined the Data Access team, which helps provide guidance across the core business products on how to design and access data, as well as coding high-volume data web services. In his personal time he loves learning about new programming languages and reading books on computer science; Seven Databases in Seven Weeks is a personal favorite.
TBA
James Hughes is a Principal Technologist at Seagate Technology. He was formerly with Huawei and with Sun Microsystems, where he was a Sun Fellow, VP, and the Solaris Chief Technologist. James is a recognized expert in the areas of storage, networking, and information security. Before Sun, James worked at StorageTek, Network Systems, and Control Data Corp. He has over 40 years of experience in OS, storage, networking, information security, and cryptography, and he is the holder of 30 patents with many more pending.