Tracking Business Metrics (or Fun with Custom Metrics) 

Most Stackdriver users work in technical jobs.  Those of us in such positions tend to think about problems from the point of view of technology.  While Stackdriver’s support for custom metrics is a great way to supplement application monitoring with additional technical measurements (like the number of alerts sent to users or the failure […]

When Eventual Consistency is Really Eventual 

Amazon’s S3 storage service promises extreme durability (11 9s) but only eventual consistency.  Applications must be designed to accommodate this relaxed consistency model.  Because consistency is often achieved within just seconds, it is easy to be lulled into habits that fail when the latency to consistency is much higher.  Can your system handle very high […]

AWS C3 Instance Adoption Follow-Up 

During re:Invent in November, AWS announced the new C3 compute-optimized instance class. A few weeks after the announcement, we shared some data on this blog showing the adoption rate of the new C3 instances. Now that a little more time has passed, we wanted to check in again with some data about the adoption of […]

Adoption of Amazon’s new EC2 C3-class instances 

Adoption of EC2 C3 instances by day since launch

It has been about three weeks since Amazon announced the new C3 instance class at re:Invent. The introduction of a new instance class gives us the rare opportunity to observe the adoption of new offerings in AWS. The chart below shows the percentage of C3 instance types monitored by Stackdriver starting the day before the announcement to […]

Announcing Subgroups for Organizing Resources 

Stackdriver Intelligent Monitoring now allows you to nest groups to organize monitored resources in a way that better reflects your application architecture. At Stackdriver, we want to monitor and alert on your infrastructure using the same terminology that you use when you think about your application. The “groups” feature allows you to define filters which […]

Report on October 23 Stackdriver Outage 

On Wednesday, October 23, around 9:45am ET, the Stackdriver Intelligent Monitoring application suffered an outage.  We sincerely apologize for this incident.  We know that you depend on Stackdriver to help keep your application running, and when Stackdriver is unavailable, your job becomes more difficult.  We take this occurrence very seriously and will improve the service to reduce […]

Understanding CPU Steal – An Experiment 

Recently at Stackdriver, we wanted to understand the metric of “CPU steal”.  We knew that it was a relatively recent addition to the Unix method of tracking CPU usage, and we knew that it was introduced with the growing use of virtualization.  We wanted to understand the metric better so that we could provide […]

Teaching an Old Developer New DevOps Tricks – DevOpsDays NYC Presentation 

Last week, I had the privilege of sharing a short ignite talk at DevOpsDays NYC (http://devopsdays.org/events/2013-newyork/).  The talk describes my experiences of transitioning from an enterprise software engineering role into an organization that operates in a manner more influenced by the DevOps movement.  The talk also includes a call for the DevOps community to […]

Intelligent Monitoring – Noisy Neighbor Detection 

Detect noisy neighbors on your shared AWS box with Stackdriver

Here at Stackdriver, we continue to add new types of analysis to help you better understand and optimize your cloud deployment. Today, we have released a new analysis that informs you when one of your instances has “noisy neighbors”. As you know, the instances provided by cloud providers are virtual machines (guests) running in a […]

Intelligent Monitoring – Trending Towards Resource Exhaustion 

At Stackdriver, we are continuing to create new intelligence features to help you understand the deployment of your app in the public cloud. Today, we are releasing an intelligence feature to combat a silent killer. The exhaustion of a resource–like memory or disk space–can strike at any time and lead to a range of problems. […]