Blogroll: Cloudflare

I read blogs, as well as write one. The 'blogroll' on this site reproduces some posts from some of the people I enjoy reading. There are currently 52 posts from the blog 'Cloudflare.'

Disclaimer: Reproducing an article here does not necessarily imply agreement or endorsement!

The Cloudflare Blog

Introducing Cloudflare One

Mon, 12/10/2020 - 14:01

Today we’re announcing Cloudflare One™. It is the culmination of engineering and technical development guided by conversations with thousands of customers about the future of the corporate network. It provides secure, fast, reliable, cost-effective network services, integrated with leading identity management and endpoint security providers.

Over the course of this week, we'll be rolling out the components that enable Cloudflare One, including our WARP Gateway Clients for desktop and mobile, our Access for SaaS solution, our browser isolation product, and our next generation network firewall and intrusion detection system.

The old model of the corporate network has been made obsolete by mobile, SaaS, and the public cloud. The events of 2020 have only accelerated the need for a new model. Zero Trust networking is the future and we are proud to be enabling that future. Having spent the last two years building the components of Cloudflare One, we’re excited to unveil today how they’ve come together into a robust SASE solution, and to share how customers are already using it to deliver the more secure and productive future of the corporate network.

What Is Cloudflare One? Secure, Optimized Global Networking

Cloudflare One is a comprehensive, cloud-based network-as-a-service solution designed to be secure, fast, and reliable, and to define the future of the corporate network. It replaces a patchwork of appliances and WAN technologies with a single network that provides cloud-based security, performance, and control through one user interface.

Cloudflare One brings together how users connect, on-ramps for branch offices, secure connectivity for applications, and controlled access to SaaS into a single platform.

Cloudflare One reflects the complex nature of corporate networking today: mobile and remote users, SaaS applications, a mix of applications hosted in private data centers and public cloud, as well as the challenge of employees using the broader Internet securely from their corporate and personal devices.

Whether you call this SASE or simply the new reality, today’s enterprise needs flexibility at every layer of the network and application stack. Secure and authenticated access is needed for users wherever they are: at the office, on a mobile device or working from home. Corporate network architectures need to reflect the state of modern computing that requires secure, filtered Internet access to get to SaaS or public cloud, secure application connectivity to protect against hackers and DDoS, and fast, reliable branch and home office access.

And the new corporate network needs to be global. No matter where applications are hosted, or employees reside, connectivity needs to be secure and fast. With Cloudflare’s massive global presence, traffic is secured, routed, and filtered over an optimized backbone that uses real time Internet intelligence to protect against the latest threats and route traffic around bad Internet weather and outages.

However, you’re only as strong as your weakest link. It doesn’t matter how secure your network is if you allow the wrong people access, or your end users’ devices are compromised. That is why we’re incredibly excited to announce that Cloudflare One takes the power of Cloudflare’s network and combines it with best-of-breed identity management and device integrity to create a complete solution that encompasses the entire corporate network of today and tomorrow.

Partner ecosystem: Identity Management

Most organizations already have one or more identity management systems. Rather than requiring them to change, we are integrating with all the major providers. This week we're announcing partnerships with Okta, Ping Identity, and OneLogin. We support nearly all the other leading identity providers, including Microsoft Active Directory and Google Workspace, as well as broadly adopted consumer and developer identity platforms like GitHub, LinkedIn, and Facebook.

Powerfully, Cloudflare One does not require you to standardize on just one identity provider. We see multiple companies that may have one identity provider for full-time employees and another for contractors. Or one they chose themselves and another they inherited from an acquisition. Cloudflare One will integrate with one or more identity providers and allow you to then set consistent policies across all your applications.

The metaphor that makes sense to me is that the identity provider issues passports and Cloudflare One is the border agent that checks that they're valid. At any particular moment, different passports from different providers may be allowed or forbidden to enter just by updating the instructions the border agent follows.
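
To make the metaphor concrete, here is a minimal sketch of what a multi-provider policy could look like, written as a plain JavaScript object. The shape and the names are hypothetical illustrations, not Cloudflare's actual configuration API; the point is simply that one rule set can consult several "passport issuers" at once.

// Hypothetical policy shape -- illustrative only, not the real Cloudflare One API.
const wikiAccessPolicy = {
  application: "wiki.internal.example.com",
  // Passports from either issuer may be presented...
  identityProviders: ["okta-employees", "onelogin-contractors"],
  // ...but the "border agent" still checks each traveller against these rules.
  allow: [
    { idp: "okta-employees", group: "engineering" },
    { idp: "onelogin-contractors", group: "project-phoenix" },
  ],
};

// Revoking an issuer or a group is one edit to the instructions the agent follows.
function isAllowed(user, policy) {
  return policy.allow.some(
    (rule) => rule.idp === user.idp && user.groups.includes(rule.group)
  );
}

console.log(isAllowed({ idp: "okta-employees", groups: ["engineering"] }, wikiAccessPolicy)); // true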

Partner ecosystem: Device Integrity

In addition to identity, device integrity and endpoint security are an important part of a zero trust solution. This week we're announcing partnerships with CrowdStrike, VMware Carbon Black, SentinelOne, and Tanium. These providers run on devices and ensure that they haven't been compromised. Again, organizations can centralize around a single vendor for device integrity or can mix and match with Cloudflare One providing a consistent control plane.

Extending the border control analogy, it's like having a temperature screening and COVID-19 test when you enter a country. Even if you have a valid passport, if you're not healthy then you will be turned away. By partnering with the leading identity and device integrity providers, Cloudflare One provides a robust identity and access management solution that fully delivers on the promise of Zero Trust.

We’re thrilled to partner with these leading identity management and endpoint security companies to make Cloudflare One flexible and robust.

With this as an introduction to Cloudflare One, I wanted to provide some context on why the existing paradigm doesn’t work, what the future of the enterprise network looks like, and where we go from here. In order to understand the power of Cloudflare One, you first have to understand the way we used to build and secure corporate networks, and how the transition to mobile, cloud, and remote work has forced a fundamental change in that paradigm.

The Middle(box) Ages: How Corporate Security Used to Work

The Internet was designed to be a massive, decentralized network. Any computer could connect to that network and route data from one location to another. The model provided resiliency, but did not guarantee fast or available connections. The early Internet also lacked a framework for security.

As a result, enterprises did not trust the Internet as a platform for their businesses. To keep employees productive, network connections had to be fast and available. Those connections also had to be secure. So, businesses built their own shadow versions of the Internet:

  • Companies purchased dedicated, private connections between offices and across their data centers in the form of expensive MPLS links.
  • IT teams managed complex routing across offices, VPN hardware, and clients.
  • Security teams deployed physical firewall boxes and DDoS appliances to keep the private network safe.
  • When employees had to use the Internet, security teams backhauled traffic through a central location to filter outbound connections with yet more hardware: Internet gateways.

Legacy corporate security followed a castle and moat approach. You put all your sensitive applications and data in the castle, you required all your employees to come to work in the castle every day, and then you built a metaphorical moat around the castle using firewalls, DDoS appliances, gateways and more: an unmanageable mess of devices and vendors.

The Middle(box) Ages Are Long Gone

Smarter attackers finding ways to breach the moat were always a concern for the castle and moat approach, but ultimately they weren’t what caused the approach to fail. Instead, the change came from the transformation of the technical landscape. Smartphones made workers increasingly mobile, letting them venture outside the moat. SaaS and the public cloud moved data and corporate applications out of the metaphorical castle.

And, in 2020, COVID-19 changed everything by forcing everyone who could to work remotely. If employees aren't coming to work in the castle anymore, the whole paradigm breaks down. This transition was already happening, but this year poured gasoline on the already smoldering fire. Increasingly, companies are realizing that the only way forward is to embrace the fact that employees, servers and applications are now “on the Internet” and not “in the castle.” This new paradigm is known as “Zero Trust.”

Google's seminal paper, "BeyondCorp: A New Approach to Enterprise Security," published in 2014, brought the idea of Zero Trust security into the mainstream. Google's insight in 2014 was that you could solve the challenges of every employee and application being on the Internet by ensuring that every application would inherently distrust every connection. If there was zero trust inherent to what network you were on, then every user of every application would be continuously authenticated. Powerfully, that would simultaneously enhance security while enabling more use of cloud applications as well as mobile and remote work.
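
In code terms, the shift is from trusting anything that reaches the private network to authenticating every single request at the application. A minimal sketch of that per-request check might look like the following, where verifyToken is a deliberately toy stand-in for real cryptographic token validation:

// Zero Trust sketch: no request is trusted because of where it came from.
// verifyToken is a toy stand-in for real signature/token validation.
function verifyToken(token) {
  return token === "valid-signed-token" ? { user: "alice@example.com" } : null;
}

function handleRequest(request) {
  // Every request is authenticated, even ones arriving from "inside" the network.
  const identity = verifyToken(request.headers["authorization"] || "");
  if (!identity) {
    return { status: 401, body: "Authentication required" };
  }
  return { status: 200, body: "Hello, " + identity.user };
}

console.log(handleRequest({ headers: {} }).status); // 401
console.log(handleRequest({ headers: { authorization: "valid-signed-token" } }).status); // 200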

The Future LAN: A Secure WAN

What we realized talking to customers was that even the analyst and competitor framing of the future corporate network didn't fully recognize some challenges that come with a Zero Trust model. One of the benefits of embracing a Zero Trust model is that it makes enabling branch and home offices easier and less expensive. Rather than having to lease expensive MPLS circuits to connect branch offices — something that is practically impossible when people work from home — you instead require every use of every application to be authenticated.

This lines up with something else we’ve heard from our customers over the last six months: “maybe the Internet is almost good enough.” Like physical offices, many MPLS or SD-WAN deployments are currently sitting idle. And yet, employees continue to be productive. If users could move to a model that runs on the Internet, and one that improves the Internet, teams can stop spending money on legacy routing. Rather than trying to build more private networks, the corporate network of the future leverages the Internet but with heightened security, performance, and reliability.

That sounds great, but it opens a whole new can of worms. To do this, you inherently need to expose more of your applications to the Internet. While they may be safe from unauthorized use if you've properly implemented Zero Trust, that exposure opens them to many less sophisticated, but highly disruptive, challenges.

At the end of 2019 we saw a disturbing new trend begin to emerge. DDoS attackers shifted their focus from embarrassing companies by knocking their websites offline to increasingly targeting internal applications and networks. Unfortunately, we've seen more of these attacks launched throughout the pandemic.

It’s not a coincidence. It's the direct result of companies being forced to expose more of their internal applications to the Internet in order to support remote work. We anticipated that Access and Gateway would be the natural pairing of products; to our surprise, customers looking to move to a Zero Trust model just as often bundle Cloudflare's DDoS and WAF products.

It makes sense. If you are exposing more of your applications to the Internet, then the problems that Internet-facing applications have always had to deal with become the problems of your internal applications as well. It's become clear to us that the future of a SASE or Zero Trust network needs to include DDoS mitigation and WAF too.

Making the Internet Secure and Reliable Enough for the Enterprise

We agree with the customers we’ve talked to who say that the Internet is almost good enough to replace a corporate network. We’ve been building products to fill in the gaps where it needs to be better. Virtual appliances in regional public cloud providers are not sufficient. Enterprises need a global, distributed network that accelerates traffic in any location.

We’ve spent the last decade building Cloudflare’s network: bringing the Internet closer to users around the world and supporting incredible scale. According to W3Techs, more than 14% of the web already relies on our network. We can also use that network to constantly measure the Internet at scale and find faster routes. That scale allows us to deliver Cloudflare One to any organization, no matter where they are located or how global their workforce, and ensure their network and applications are secure, fast, and reliable.

Foreshadowing Cloudflare One

The same lessons we’ve learned handling traffic for the websites on our network can be applied to how enterprises connect to everything else. We started that journey last year when we launched Cloudflare WARP, a consumer product that routes all connections leaving a personal device through Cloudflare’s network, where we can encrypt and accelerate it. This week, we’ll show how the WARP Client is now one of the on-ramps to get employee traffic onto Cloudflare One.

We launched WARP on mobile devices because we knew they would prove to be the most difficult to get right. Traditionally, VPN clients are clunky battery hogs designed for desktops and, if they have mobile versions at all, they’ve been clumsily ported over. We set out to build WARP to work great on mobile, without burning battery life or slowing connections down, because we knew that if we could pull that off, it would be easy to port it to the less constrained environment of the desktop.

We also launched it for consumers first because they are the best QA team you could ever assemble. More than 10 million consumers have been putting WARP through its paces for the last year. We’ve seen edge cases from every corner of the Internet and used them to iron the bugs out. We knew that if we could make the WARP Client something that consumers loved to use then it would be a stark contrast to every other enterprise solution in the market.

Meanwhile, we built products to deliver the same improvements to data centers and offices. We announced Magic Transit last year to provide secure, performant, and reliable IP connectivity to the Internet. Earlier this year, we expanded that model when we launched Cloudflare Network Interconnect (CNI) to allow our customers to interconnect branch offices and data centers directly with Cloudflare.

Cloudflare Access starts by introducing identity into Cloudflare’s network. We apply filters based on identity and context to both inbound and outbound connections. Every login, request, and response proxies through Cloudflare’s network regardless of the location of the server or user.

Cloudflare Gateway keeps connections to the rest of the Internet safe. By routing all traffic through Cloudflare’s network first, customers can deprecate on-premise firewalls, eliminating the Internet backhaul requirements that slow down users.

Pulling the Pieces Together

We think about the products in Cloudflare One in two categories:

  • On-ramps: the products that connect a user, device, or location to Cloudflare’s edge. WARP for endpoints, Magic Transit and CNI for networks, Argo Smart Routing to accelerate traffic.
  • Filters: the products that shield networks from attacks, inspect traffic for threats, and apply least privilege rules to data and applications. Access for Zero Trust rules, Gateway for traffic filtering, Magic Firewall for network filtering.

Most competitors in this space focus on one area, which loses out on the efficiencies of combining them in a single solution. Cloudflare One brings those together on our network. By integrating both sides of the challenge, we can give administrators a single place to manage and secure their network.

What Differentiates Cloudflare One

Easy to Deploy, Manage, and Use

We’ve always offered free and pay-as-you-go plans that teams of any size could sign up for with a credit card. Those customers lack the systems integrators or IT departments of large enterprises. To serve those teams, we had to build a control plane and dashboard that was accessible and easy to use.

The products in Cloudflare One follow that same approach: comprehensive enough for enterprises, yet easy enough to use that they are accessible to any team. We’ve also extended that to end users: the client application that powers Gateway is built on what we learned creating Cloudflare WARP for consumer users.

Unified Solution

Cloudflare One puts the entire corporate network behind a single pane of glass. By integrating with leading identity providers and endpoint security solutions, Cloudflare One enables companies to enforce a consistent set of policies across all their applications. Since the network is the common denominator of all applications, by building control into the network Cloudflare One ensures consistent policies whether an application is new or legacy, run on-premise or in the cloud, and delivered from your own infrastructure or a multi-tenant SaaS provider.

Cloudflare One also helps rationalize complicated deployments. While it would be great if every app and every employee and contractor used the same identity provider, for example, that isn’t always possible. Acquisitions, skunkworks projects, and internal disagreements can cause multiple different solutions to be present inside one company. Cloudflare One allows you to plug different providers into one unified network control plane to ensure consistent policies.

Significant ROI

Our core tenet of serving the entire Internet has always forced us to obsess over costs. Efficiency is in the DNA of Cloudflare and we use our efficiency to pass along customer-friendly, fixed-rate pricing. Cloudflare One builds on that experience to deliver a platform that is more cost-effective than combining point solution vendors. The differences are especially apparent versus other providers who have tried to build on top of public cloud platforms and inherit their cost and inconsistent network performance.

To achieve the level of efficiency needed to compete with hardware appliances, we had to invent a new type of platform. That platform needed to be built on our own network, where we could drive costs down and ensure the highest level of performance. It needed to be architected so that any server in any city that makes up Cloudflare’s network could run every one of our services. That means Cloudflare One runs across Cloudflare’s global network spanning more than 200 cities worldwide. Even your farthest-flung branch offices and remote workers are likely within milliseconds of servers powering Cloudflare One, ensuring our service works well wherever your team works.

Leverages Cloudflare’s Scale

Cloudflare already sits in front of a huge portion of the Internet. That allows us to see and respond to new security threats continuously. It also means that Cloudflare One customers’ traffic can be more efficiently routed, even when going to applications that would appear to be on the public Internet.

For instance, an employee behind Cloudflare One who is catching up on holiday shopping during their lunch break can have their traffic routed from a corporate branch office, across Cloudflare’s Magic Transit, over Cloudflare’s global backbone, across Cloudflare’s Network Interconnect, and to the ecommerce provider. Because Cloudflare handles the packets end-to-end, we can ensure they are encrypted, optimally routed, and efficiently delivered. As more of the Internet uses Cloudflare, the experience of surfing the Internet for Cloudflare One customers will grow even more exceptional.

What Does Cloudflare One Replace?

Instead of expensive MPLS links or complex SD-WAN deployments, Cloudflare One provides two on-ramps to your applications and the entire Internet: WARP and Magic Transit. WARP connects employees from any device, and any location, to Cloudflare’s network. Magic Transit allows broad deployments across whole offices or data centers.

Cloudflare Access replaces private-networks-as-security with Zero Trust controls. Later this week, we’ll announce how you can extend Access to any application, including SaaS applications.

Finally, Cloudflare One eliminates traditional network firewalls and web gateways. Cloudflare Gateway inspects traffic leaving any device in your organization to block threats on the Internet and prevent data from leaving. Magic Firewall will give your networks the same security, filtering traffic at the transport layer to replace the top-of-rack firewalls that block data exfiltration or attacks from insecure network protocols.

What Comes Next?

Your team can start using Cloudflare One today. Add Zero Trust control to your applications with Cloudflare Access and secure DNS queries with Cloudflare Gateway. Keep networks safe from DDoS attacks with Magic Transit and connect your applications through Cloudflare with Argo Tunnel.

Over the course of the week, we’ll be launching new features and products to start to complete this vision. On Tuesday, we’ll extend the Zero Trust security of Cloudflare Access to all of your applications. Starting Wednesday, teams will be able to use Cloudflare WARP to proxy all employee traffic to Cloudflare where Gateway will now secure more than just DNS queries. You’ll be invited to sign up for Cloudflare’s browser isolation beta on Thursday and we’ll wrap the week with new APIs to control how Magic Transit secures your network.

It’s going to be a busy week, but we’re just getting started. Replacing a corporate network should not also mean you lose control over how that network operates. Magic WAN is our solution to complex SD-WAN deployments.

Security for that entire network should also work in both directions. Magic Firewall is our alternative to the clunky “next-generation firewall” appliances that secure outbound traffic. Data loss prevention (DLP) is another space that has lacked innovation and where we plan to extend Cloudflare One.

Finally, you should have visibility into that network. We’ll be launching new tools to detect and mitigate intrusion attempts that happen anywhere on your network, including unauthorized access to any SaaS applications you use. Now that we’ve built the on-ramps onto Cloudflare One, we’re excited to continue to innovate to provide more functionality and control to solve our customers’ biggest network security, performance, and reliability challenges.

Delivering the Network Customers Need Today

Over the last 10 years, Cloudflare has built one of the fastest, most reliable, most secure networks in the world. We’ve seen the power of using that network internally to enable our own teams to innovate quickly and securely. With the launch of Cloudflare One, we’re extending the power of Cloudflare’s network to meet the challenges of any company. The move to Zero Trust is a paradigm shift, but we believe the changes to how we work have made it inevitable for every company. We’re proud of how we’ve been able to help some of Cloudflare One’s first customers reinvent their corporate networks. It makes sense to close with their own words.

"JetBlue Travel Products needed a way to give crew-members secure and simple access to internally-managed benefit apps. Cloudflare gave us all that and more — a much more efficient way to connect business partners and crew-members to critical internal tools." — Vitaliy Faida, General Manager, Data/DevSecOps at JetBlue Travel Products.

"OneTrust relies on Cloudflare to maintain our network perimeter, so we can focus on delivering technology that helps our customers be more trusted. With Cloudflare, we can easily build context-aware Zero Trust policies for secure access to our developer tools. Employees can connect to the tools they need so simply that teams don't even know Cloudflare is powering the backend. It just works." — Blake Brannon, CTO of OneTrust.

"Discord is where the world builds relationships. Cloudflare helps us deliver on that mission, connecting our internal engineering team to the tools they need. With Cloudflare, we can rest easy knowing every request to our critical apps is evaluated for identity and context — a true Zero Trust approach." — Mark Smith, Director of Infrastructure at Discord.

"When you're a fast-growing, security-focused company like Area 1, anything that slows development down is the enemy. With Cloudflare, we've found a simpler, more secure way to connect our employees to the tools they need to keep us growing - and the experience is lightning-fast." — Blake Darché, CSO at Area 1 Security.

"We launched quickly in April 2020 to bring remote learning to children throughout the UK during the coronavirus pandemic. Cloudflare Access made it fast and simple to authenticate a huge network of teachers and developers into our production sites, and we set it up in literally less than an hour. Cloudflare's WAF helped ensure the security and resilience of our public-facing website from day one." — John Roberts, Technology Director at Oak National Academy.

"With Cloudflare, we've been able to reduce our dependence on VPNs and IP allow-listing for development environments. Our developers and testers aren't required to log in from specific locations, and we've been able to deploy an SSO solution to simplify the login process. Access is easier to manage than VPNs and other remote access solutions, which has removed pressure from our IT teams. They can focus on internal projects instead of spending time managing remote access." — Alexandre Papadopoulos, Director of Cyber Security, INSEAD.

What is Cloudflare One?

Mon, 12/10/2020 - 14:00

Running a secure enterprise network is really difficult. Employees spread all over the world work from home. Applications are run from data centers, hosted in public cloud, and delivered as services. Persistent and motivated attackers exploit any vulnerability.

Enterprises used to build networks that resembled a castle-and-moat. The walls and moat kept attackers out and data in. Team members entered over a drawbridge and tended to stay inside the walls. Trust folks on the inside of the castle to do the right thing, and deploy whatever you need in the relative tranquility of your secure network perimeter.

The Internet, SaaS, and “the cloud” threw a wrench in that plan. Today, more of the workloads in a modern enterprise run outside the castle than inside. So why are enterprises still spending money building more complicated and less effective moats?

Today, we’re excited to share Cloudflare One™, our vision to tackle the intractable job of corporate security and networking.

What is Cloudflare One?

Cloudflare One combines networking products that enable employees to do their best work, no matter where they are, with consistent security controls deployed globally.

Starting today, you can begin replacing traffic backhauls to security appliances with Cloudflare WARP and Gateway to filter outbound Internet traffic. For your office networks, we plan to bring next-generation firewall capabilities to Magic Transit with Magic Firewall to let you get rid of your top-of-rack firewall appliances.

With multiple on-ramps to the Internet through Cloudflare, and the elimination of backhauled traffic, we plan to make it simple and cost-effective to manage that routing compared to MPLS and SD-WAN models. Cloudflare Magic WAN will provide a control plane for how your traffic routes through our network.

You can use Cloudflare One today to replace the other function of your VPN: putting users on a private network for access control. Cloudflare Access delivers Zero Trust controls that can replace private network security models. Later this week, we’ll announce how you can extend Access to any application - including SaaS applications. We’ll also preview our browser isolation technology to keep the endpoints that connect to those applications safe from malware.

Finally, the products in Cloudflare One focus on giving your team the logs and tools to both understand and then remediate issues. As part of our Gateway filtering launch this week we’re including logs that provide visibility into the traffic leaving your organization. We’ll be sharing how those logs get smarter later this week with a new Intrusion Detection System that detects and stops intrusion attempts.

Many of those components are available today, some new features are arriving this week, and other pieces will be launching soon. Altogether, we’re excited to share this vision for the future of the corporate network.

Problems in enterprise networking and security

The demands placed on a corporate network have changed dramatically. IT has gone from a back-office function to mission critical. In parallel with networks becoming more integral, users spread out from offices to work from home. Applications left the datacenter and are now being run out of multiple clouds or are being delivered by vendors directly over the Internet.

Direct network paths became hairpin turns

Employees sitting inside of an office could connect over a private network to applications running in a datacenter nearby. When team members left the office, they could use a VPN to sneak back onto the network from outside the walls. Branch offices hopped on that same network over expensive MPLS links.

When applications left the data center and users left their offices, organizations responded by trying to force that scattered world into the same castle-and-moat model. Companies purchased more VPN licenses and replaced MPLS links with difficult SD-WAN deployments. Networks became more complex in an attempt to mimic an older model of networking when in reality the Internet had become the new corporate network.

Defense-in-depth splintered

Attackers looking to compromise corporate networks have a multitude of tools at their disposal, and may execute surgical malware strikes, throw a volumetric kitchen sink at your network, or any number of things in between. Traditionally, defense against each class of attack was provided by a separate, specialized piece of hardware running in a datacenter.

Security controls used to be relatively easy when every user and every application sat in the same place. When employees left offices and workloads left data centers, the same security controls struggled to follow. Companies deployed a patchwork of point solutions, attempting to rebuild their topside firewall appliances across hybrid and dynamic environments.

High-visibility required high-effort

The move to a patchwork model sacrificed more than just defense-in-depth — companies lost visibility into what was happening in their networks and applications. We hear from customers that the capture and standardization of logs has become one of their biggest hurdles, even after purchasing expensive data ingestion, analysis, storage, and analytics tools.

Increasing regulatory and compliance pressures place even more emphasis on data retention and analysis, and splintered security solutions become a data management nightmare.

Fixing issues relied on best guesses

Without visibility into this new networking model, security teams had to guess at what could go wrong. Organizations who wanted to adopt an “assume breach” model struggled to determine what kind of breach could even occur, so they threw every possible solution at the problem.

We talk to enterprises who purchase new scanning and filtering services, delivered in virtual appliances, for problems they are unsure they have. These teams attempt to remediate every possible event manually, because they lack visibility, rather than targeting specific events and adapting the security model.

How does Cloudflare One fit?

Over the last several years, we’ve been assembling the components of Cloudflare One. We launched individual products to target some of these problems one-at-a-time. We’re excited to share our vision for how they all fit together in Cloudflare One.

Flexible data planes

Cloudflare launched as a reverse proxy. Customers put their Internet-facing properties on our network and their audience connected to those specific destinations through our network. Cloudflare One represents years of launches that allow our network to process any type of traffic flowing in either the “reverse” or “forward” direction.

In 2019, we launched Cloudflare WARP — a mobile application that kept Internet-bound traffic private with an encrypted connection to our network while also making it faster and more reliable. We’re now packaging that same technology into an enterprise version launching this week to connect roaming employees to Cloudflare Gateway.

Your data centers and offices should have the same advantage. We launched Magic Transit last year to secure your networks from IP-layer attacks. Our initial focus with Magic Transit has been delivering best-in-class DDoS mitigation to on-prem networks. DDoS attacks are a persistent thorn in network operators’ sides, and Magic Transit effectively defuses their sting without forcing performance compromises. That rock-solid DDoS mitigation is the perfect platform on which to build higher-level security functions that apply to the same traffic already flowing across our network.

Earlier this year, we expanded that model when we launched Cloudflare Network Interconnect (CNI) to allow our customers to interconnect branch offices and data centers directly with Cloudflare. As part of Cloudflare One, we’ll apply outbound filtering to that same connection.

Cloudflare One should not just help your team move to the Internet as a corporate network, it should be faster than the Internet. Our network is carrier-agnostic, exceptionally well-connected and peered, and delivers the same set of services globally. In each of these on-ramps, we’re adding smarter routing based on our Argo Smart Routing technology, which has been shown to reduce latency by 30% or more in the real-world. Security + Performance, because they’re better together.

A single, unified control plane

When users connect to the Internet from branch offices and devices, they skip the firewall appliances that used to live in headquarters altogether. To keep pace, enterprises need a way to secure traffic that no longer lives entirely within their own network. Cloudflare One applies standard security controls to all traffic - regardless of how that connection starts or where in the network stack it lives.

Cloudflare Access starts by introducing identity into Cloudflare’s network. Teams apply filters based on identity and context to both inbound and outbound connections. Every login, request, and response proxies through Cloudflare’s network regardless of the location of the server or user. The scale of our network and its distribution can filter and log enterprise traffic without compromising performance.

Cloudflare Gateway keeps connections to the rest of the Internet safe. Gateway inspects traffic leaving devices and networks for threats and data loss events that hide inside of connections at the application layer. Launching soon, Gateway will bring that same level of control lower in the stack to the transport layer.

You should have the same level of control over how your networks send traffic. We’re excited to announce Magic Firewall, a next-generation firewall for all traffic leaving your offices and data centers. With Gateway and Magic Firewall, you can build a rule once and run it everywhere, or tailor rules to specific use cases in a single control plane.
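
As an illustration of the "build a rule once" idea, the sketch below defines a single egress rule as a plain JavaScript object and derives both an application-layer and a network-layer version of it. The object shapes and names are hypothetical, not the actual Gateway or Magic Firewall configuration format.

// Hypothetical unified rule -- not the real Gateway/Magic Firewall config format.
const blockFileSharing = {
  name: "block-unsanctioned-file-sharing",
  action: "block",
  destinationHosts: ["*.fileshare.example.com"], // application-layer match
  destinationPorts: [443, 8443],                 // transport-layer match
};

// One definition, two enforcement points.
function toGatewayRule(rule) {
  return { action: rule.action, hostnames: rule.destinationHosts };
}

function toMagicFirewallRule(rule) {
  return { action: rule.action, ports: rule.destinationPorts };
}

console.log(toGatewayRule(blockFileSharing));
console.log(toMagicFirewallRule(blockFileSharing));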

We know some attacks can’t be filtered because they launch before filters can be built to stop them. Cloudflare Browser, our isolated browser technology, gives your team a bulletproof pane of glass against threats that can evade known filters. Later this week, we’ll invite customers to sign up to join the beta to browse the Internet on Cloudflare’s edge without the risk of code leaping out of the browser to infect an endpoint.

Finally, the PKI infrastructure that secures your network should be modern and simple to manage. We heard from customers who described certificate management as one of the core problems of moving to a better security model. Cloudflare works with, not against, modern encryption standards like TLS 1.3. Cloudflare made it easy to add encryption to your sites on the Internet with one click. We’re going to bring that ease of management to the network functions you run on Cloudflare One.

One place to get your logs, one location for all of your security analysis

Cloudflare’s network serves 18 million HTTP requests per second on average. We’ve built logging pipelines that make it possible for some of the largest Internet properties in the world to capture and analyze their logs at scale. Cloudflare One builds on that same capability.

Cloudflare Access and Gateway capture every request, inbound or outbound, without any server-side code changes or advanced client-side configuration. Your team can export those logs to the SIEM provider of your choice with our Cloudflare Logpush service - the same pipeline that exports HTTP request events at scale for public sites. Magic Transit expands that logging capability to entire networks and offices to ensure you never lose visibility from any location.
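
For teams building their own analysis on top of those exports, Logpush delivers events in batches of newline-delimited JSON. The sketch below summarizes blocked events from one such batch; the field names (Action, ClientIP) are assumptions for illustration, since the actual schema depends on the dataset and fields you configure.

// Sketch: count blocked events per client IP in a newline-delimited JSON batch.
// Field names are illustrative assumptions; the real schema depends on your
// configured Logpush dataset and field list.
function summarizeBlocked(ndjsonBatch) {
  const counts = new Map();
  for (const line of ndjsonBatch.split("\n")) {
    if (!line.trim()) continue; // skip blank lines between records
    const event = JSON.parse(line);
    if (event.Action !== "block") continue;
    counts.set(event.ClientIP, (counts.get(event.ClientIP) || 0) + 1);
  }
  return counts;
}

const batch = [
  JSON.stringify({ ClientIP: "203.0.113.7", Action: "block" }),
  JSON.stringify({ ClientIP: "203.0.113.9", Action: "allow" }),
].join("\n");
console.log(summarizeBlocked(batch)); // Map(1) { "203.0.113.7" => 1 }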

We’re going beyond just logging events. Available today for your websites, Cloudflare Web Analytics converts logs into insights. We plan to keep expanding that visibility into how your network operates, as well. Just as Cloudflare has replaced the “band-aid boxes” that performed disparate network functions and unified them into a cohesive, adaptable edge, we intend to do the same for the fragmented, hard to use, and expensive security analytics ecosystem. More to come on this soon.

Smarter, faster remediation

Data and analytics should surface events that a team can remediate. Log systems that lead to one-click fixes can be powerful tools, but we want to make that remediation automatic.

Launching into a closed preview later this week, Cloudflare Intrusion Detection System (IDS) will proactively scan your network for anomalous events and recommend actions or, better yet, take actions for you to remediate problems. We plan to bring that same proactive scanning and remediation approach to Cloudflare Access and Cloudflare Gateway.

Run your network on our globally scaled network

Over 25 million Internet properties rely on Cloudflare’s network to reach their audiences. More than 10% of all websites connect through our reverse proxy, including 16% of the Fortune 1000. Cloudflare accelerates traffic for huge chunks of the Internet by delivering services from datacenters around the world.

We deliver Cloudflare One from those same data centers. And critically, every datacenter we operate delivers the same set of services, whether that is Cloudflare Access, WARP, Magic Transit, or our WAF. As an example, when your employees connect through Cloudflare WARP to one of our data centers, there is a real chance they never have to leave our network or that data center to reach the site or data they need. As a result, their entire Internet experience becomes extraordinarily fast, no matter where they are in the world.

We expect that performance bonus to become even more meaningful as browsing moves to Cloudflare’s edge with Cloudflare Browser. The isolated browsers running in Cloudflare’s data centers can request content that sits just centimeters away. Even further, as more web properties rely on Cloudflare Workers to power their applications, entire workflows can stay inside of a data center within 100 ms of your employees.

What’s next?

While many of these features are available today, we’re going to be launching several new features over the next several days as part of Cloudflare’s Zero Trust week. Stay tuned for announcements each day this week that add new pieces to the Cloudflare One featureset.


Exploring WebAssembly AI Services on Cloudflare Workers

Fri, 09/10/2020 - 12:00

This is a guest post by Videet Parekh, Abelardo Lopez-Lagunas, Sek Chai at Latent AI.

Edge networks present a significant opportunity for Artificial Intelligence (AI) performance and applicability. AI technologies already make it possible to run compelling applications like object and voice recognition, navigation, and recommendations.

AI at the edge presents a host of benefits. One is scalability—it is simply impractical to send all data to a centralized cloud. In fact, one study has predicted a global scope of 90 zettabytes generated by billions of IoT devices by 2025. Another is privacy—many users are reluctant to move their personal data to the cloud, whereas data processed at the edge are more ephemeral.

When AI services are distributed away from centralized data centers and closer to the service edge, it becomes possible to enhance the overall application speed without moving data unnecessarily.  However, there are still challenges to make AI from the deep-cloud run efficiently on edge hardware. Here, we use the term deep-cloud to refer to highly centralized, massively-sized data centers. Deploying edge AI services can be hard because AI is both computational and memory bandwidth intensive. We need to tune the AI models so the computational latency and bandwidth can be radically reduced for the edge.

The Case for Distributed AI Services

Edge network infrastructure for distributed AI is already widely available. Edge networks like Cloudflare serve a significant proportion of today’s Internet traffic, and can serve as the bridge between devices and the centralized cloud. Highly-performant AI services are possible because of the distributed processing that has excellent spatial proximity to the edge data.

We at Latent AI are exploring ways to deploy AI at the edge, with technology that transforms and compresses AI models for the edge. The size of our edge AI model is many orders of magnitude smaller than the sensor data (e.g., kilobytes or megabytes for the edge AI model, compared to petabytes of edge data). We are exploring using WebAssembly (WASM) within the Cloudflare Workers environment. We want to identify possible operating points for the distributed AI services by exploring achievable performance on the available edge infrastructure.

Architectural Approach for Exploration

WebAssembly (WASM) is a new open-standard format for programs that run on the Web. It is a popular way to enable high-performance web-based applications. WASM is closer to machine code, and thus faster than interpreted or JIT-compiled JavaScript (JS). Compiler optimizations, already done ahead of time, reduce the overhead in fetching and parsing application code. Today, WASM offers the flexibility and portability of JS at the near-optimum performance of compiled machine code.

AI models have notoriously large memory demands because of their high parameter counts. Cloudflare already extends support for WASM using their Wrangler CLI, and we chose to use it for our exploration. Wrangler is the open-source CLI tool used to manage Workers, and is designed to enable a smooth developer experience.

How Latent AI Accelerates Distributed AI Services

Latent AI’s mission is to enable ambient computing, regardless of resource constraints. We build developer tools that greatly reduce the computing resources needed to process AI on the edge while remaining completely hardware-agnostic.

Latent AI’s tools significantly compress AI models to reduce their memory size. We have shown up to 10x compression for state-of-the-art models. This capability addresses the load-time latencies challenging many edge network deployments. We also offer an optimized runtime that executes a neural network natively, yielding a 2-3x runtime speedup without any hardware-specific accelerators. This dramatic performance boost offers fast and efficient inferences for the edge.

Our compression uses quantization algorithms to convert the AI model’s parameters from 32-bit floating-point to 16-bit or 8-bit representations, with minimal loss of accuracy. The key benefits of moving to lower bit-precision are higher power efficiency and reduced storage needs. AI inference can then be processed using more efficient parallel processor hardware on the continuum of platforms at the distributed edge.
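
As a rough illustration of what symmetric linear quantization does (a simplified sketch, not Latent AI's production algorithm), the snippet below maps 32-bit floating-point weights onto 8-bit integers with a single per-tensor scale factor, shrinking storage by roughly 4x:

// Simplified symmetric quantization sketch -- not Latent AI's actual tooling.
function quantizeInt8(weights) {
  const maxAbs = Math.max(...weights.map(Math.abs));
  const scale = maxAbs / 127 || 1; // one scale per tensor; guard the all-zero case
  const q = Int8Array.from(weights, (w) => Math.round(w / scale));
  return { q, scale };
}

function dequantize({ q, scale }) {
  // Recover approximate 32-bit values for stages that need floats.
  return Float32Array.from(q, (v) => v * scale);
}

const { q, scale } = quantizeInt8([0.02, -1.3, 0.75]);
console.log(q);                        // Int8Array [ 2, -127, 73 ]
console.log(dequantize({ q, scale })); // values close to the originals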

Optimized AI services can process data closest to the source and perform inferences at the distributed edge.

Selecting Real-World WASM Neural Network Examples

For our exploration, we use state-of-the-art deep neural networks called MobileNets. MobileNets are designed specifically for embedded platforms such as smartphones, and can achieve high recognition accuracy in visual object detection. We compress MobileNet AI models to be small and fast, in order to represent the variety of use cases that can be deployed as distributed AI services. Please see this blog for more details on the AI model architecture.

We use the MobileNetV2 model variant for our exploration. The models are trained to detect different visual objects: (1) a larger model with 10 object classes derived from the ImageNet dataset, and (2) a smaller version with just two classes derived from the COCO dataset. ImageNet and COCO are public, open-source image datasets used as benchmarks for AI models; their images are labeled with detected objects such as persons, vehicles, bicycles, traffic lights, etc. Using Latent AI’s compression tool, we were able to compress and compile the MobileNetV2 models into WASM programs. In WASM form, we can achieve fast and efficient processing of the AI model with a small storage footprint.

We want WASM neural networks to be as fast and efficient as possible. We spun up a Workers app to accept an image from a client, convert and preprocess the image into a cleaned data array, run it through the model, and then return a class for that image. For both the large and small MobileNetV2 models, we created three variants with different bit-precision (32-bit floating point, 16-bit integer, and 8-bit integer). The average memory and inference times for the large AI model are 110ms and 189ms, respectively; for the smaller AI model, they are 159ms and 15ms, respectively.
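
A skeleton of that Workers app might look like the sketch below. It is an outline under stated assumptions: classify is a stub standing in for the actual WASM inference call, and preprocess only scales raw bytes, whereas the real pipeline also resizes and normalizes the image.

// Skeleton of the experiment's Worker -- classify is a stub standing in for
// the compiled MobileNetV2 WASM inference call.
async function handleRequest(request) {
  if (request.method !== "POST") {
    return new Response("Send an image via POST", { status: 405 });
  }
  const bytes = new Uint8Array(await request.arrayBuffer());
  const input = preprocess(bytes);     // cleaned data array for the model
  const label = await classify(input); // run the neural network
  return new Response(JSON.stringify({ label }), {
    headers: { "Content-Type": "application/json" },
  });
}

// Toy preprocessing: scale raw bytes into [0, 1] floats.
function preprocess(bytes) {
  return Float32Array.from(bytes, (b) => b / 255);
}

// Stub: the real version instantiates the WASM module and runs inference.
async function classify(input) {
  return input.length > 0 ? "person" : "unknown";
}

addEventListener("fetch", (event) => {
  event.respondWith(handleRequest(event.request));
});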

Our analysis suggests that overall processing can be improved by reducing the overhead for memory operations. For the large model, lowering bit precision to 8-bits reduces memory operations from 48% to 26%. For the small model, the memory load times dominate over the inference computation with over 90% of the latency in memory operations.

It is important to note that our results are based on our initial exploration, which is focused more on functionality rather than optimization. We make sure the results are consistent by averaging our measurements over 50-100 iterations. We do acknowledge that there are still network and system related latencies that can be further optimized, but we believe that the early results described here show promise with respect to AI model inferences on the distributed edge.  

Comparison of memory and inference processing times for large and small DNNs.

Learning from Real-World WASM Neural Network Example

What lessons can we draw from our example use case?

First of all, we recommend a minimal compute and memory footprint for AI models deployed to the network edge. A small footprint lets the data types of WASM AI models line up better, reducing memory load overhead. WASM practitioners know that WASM speed-ups come from the tight coupling between the JavaScript API and native machine code. Because WASM code does not need to speculate on data types, parallelizing compilation for WASM can achieve better optimization.

Furthermore, we encourage running AI models at 8-bit precision to reduce the overall size. These 8-bit AI models are readily compressed and compiled for the target hardware, greatly reducing the overhead in hosting the models for inference. In addition, for video imagery there is no overhead to convert digitized raw data (e.g. image files digitized and stored as integers) to the floating-point values required by floating-point AI models.

Finally, we suggest the use of a smart cache for AI models so that Workers can essentially reduce memory load times and focus solely on neural network inferences at runtime. Again, 8-bit models allow more AI models to be hosted and ready for inference. Referring to our exploratory results, hosted small AI models can be served at approximately 15ms inference time, offering a very compelling user experience with low latency and local processing. The WASM API provides a significant performance increase over pure-JS toolchains like Tensorflow.js. For example, against the large AI model's 189ms inference time on WASM, we observed around 1500ms with a Tensorflow.js workflow, approximately an 8x difference in compute latency.
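
One simple way to approximate such a cache inside a Worker is to memoize the instantiated model in module scope, so that a warm isolate pays the memory-load cost only once and later requests go straight to inference. This is a sketch of the general pattern, with loadModel as a hypothetical placeholder, not the caching design used in the experiment.

// Sketch: memoize the instantiated model so warm isolates skip the load step.
// loadModel is a hypothetical placeholder for instantiating the WASM network.
let modelPromise = null;

function getModel() {
  if (modelPromise === null) {
    modelPromise = loadModel(); // pay the memory-load cost once per isolate
  }
  return modelPromise;          // subsequent requests reuse the same instance
}

async function loadModel() {
  // Placeholder: fetch and instantiate the compiled 8-bit WASM model here.
  return { infer: (input) => "person" };
}

addEventListener("fetch", (event) => {
  event.respondWith(
    getModel().then((model) => new Response(model.infer(null)))
  );
});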

Unlocking the Future of the Distributed Edge

With exceedingly optimized WASM neural networks, distributed edge networks can move the inference closer to users, offering new edge AI services closer to the source of the data. With Latent AI technology to compress and compile WASM neural networks, the distributed edge networks can (1) host more models, (2) offer lower latency responses, and (3) potentially lower power utilization with more efficient computing.

Example person detected using a small AI model, 10x compressed to 150KB.

Imagine, for example, that the small AI model described earlier can distinguish whether a person is in a video feed. Digital systems, e.g. doorbell and doorway entry cameras, can hook up to Cloudflare Workers to verify whether a person is present in the camera's field of view. Similarly, other AI services could conduct sound analyses to check for broken windows and water leaks. With these distributed AI services, applications can run without access to deep-cloud services. Furthermore, the sensor platform can be made with ultra low-cost, low-power hardware, in very compact form factors.

Application developers can now offer AI services with neural networks trained, compressed, and compiled natively as a WASM neural network. Latent AI developer tools can compress WASM neural networks and provide WASM runtimes offering blazingly fast inferences for the device and infrastructure edge.  With scale and speed baked in, developers can easily create high-performance experiences for their users, wherever they are, at any scale. More importantly, we can scale enterprise applications on the edge, while offering the desired return on investments using edge networks.

About Latent AI

Latent AI is an early-stage venture spinout of SRI International. Our mission is to enable developers and change the way we think about building AI for the edge. We develop software tools designed to help companies add AI to edge devices and to empower users with new smart IoT applications. For more information about the availability of LEIP SDK, please feel free to contact us at info@latentai.com or check out our website.


Let's build a Cloudflare Worker with WebAssembly and Haskell

Tue, 06/10/2020 - 12:00

This is a guest post by Cristhian Motoche of Stack Builders.

At Stack Builders, we believe that Haskell’s system of expressive static types offers many benefits to the software industry and the world-wide community that depends on our services. In order to fully realize these benefits, it is necessary to have proper training and access to an ecosystem that allows for reliable deployment of services. In exploring the tools that help us run our systems based on Haskell, our developer Cristhian Motoche has created a tutorial that shows how to compile Haskell to WebAssembly using Asterius for deployment on Cloudflare.


What is a Cloudflare Worker?

Cloudflare Workers is a serverless platform that allows us to run our code on the edge of the Cloudflare infrastructure. It's built on Google V8, so it’s possible to write functionalities in JavaScript or any other language that targets WebAssembly.

WebAssembly is a portable binary instruction format that can be executed fast in a memory-safe sandboxed environment. For this reason, it’s especially useful for tasks that need to perform resource-demanding and self-contained operations.

Why use Haskell to target WebAssembly?

Haskell is a purely functional language that can target WebAssembly. As such, it helps developers break down complex tasks into small functions that can later be composed to perform complex tasks. Additionally, it’s statically typed and has type inference, so it will complain if there are type errors at compile time. Because of that and much more, Haskell is a good source language for targeting WebAssembly.

From Haskell to WebAssembly

We’ll use Asterius to target WebAssembly from Haskell. It’s a well documented tool that is updated often and supports a lot of Haskell features.

First, as suggested in the documentation, we’ll use podman to pull the Asterius prebuilt container from Docker Hub. In this tutorial, we will use Asterius version 200617, which works with GHC 8.8.

podman run -it --rm -v $(pwd):/workspace -w /workspace terrorjack/asterius:200617

Now we’ll create a Haskell file called fact.hs that will export a pure function:

module Factorial (fact) where

fact :: Int -> Int
fact n = go n 1
  where
    go 0 acc = acc
    go n acc = go (n - 1) (n * acc)

foreign export javascript "fact" fact :: Int -> Int

In this module, we define a pure function called fact, optimized with tail recursion and exported using the Asterius JavaScript FFI, so that it can be called when a WebAssembly module is instantiated in JavaScript.

Next, we’ll create a JavaScript file called fact_node.mjs that contains the following code:

import * as rts from "./rts.mjs";
import module from "./fact.wasm.mjs";
import req from "./fact.req.mjs";

async function handleModule(m) {
  const i = await rts.newAsteriusInstance(Object.assign(req, { module: m }));
  const result = await i.exports.fact(5);
  console.log(result);
}

module.then(handleModule);

This code imports rts.mjs (the common runtime), the WebAssembly loader, and the required parameters for the Asterius instance. It creates a new Asterius instance, calls the exported function fact with the input 5, and prints out the result.

You have probably noticed that fact is invoked asynchronously. This happens with any function exported by Asterius, even if it’s a pure function.

Next, we’ll compile this code using the Asterius command line interface (CLI) ahc-link, and we’ll run the JavaScript code in Node:

ahc-link \
  --input-hs fact.hs \
  --no-main \
  --export-function=fact \
  --run \
  --input-mjs fact_node.mjs \
  --output-dir=node

This command takes fact.hs as the Haskell input file, specifies that no main function is exported, and exports the fact function. Additionally, it takes fact_node.mjs as the entry JavaScript file, replacing the file that would otherwise be generated by default, and it places the generated code in a directory called node.

Running the ahc-link command from above will print the following output in the console:

[INFO] Compiling fact.hs to WebAssembly ...
[INFO] Running node/fact.mjs
120

As you can see, the code is executed in Node and prints out the result of fact in the console.

Push your code to Cloudflare Workers

Now we’ll set everything up for deploying our code to Cloudflare Workers.

First, let’s add a metadata.json file with the following content:

{ "body_part": "script", "bindings": [ { "type": "wasm_module", "name": "WASM", "part": "wasm" } ] }

This file is needed to specify the wasm_module binding. The name value is the global variable through which your Worker code accesses the WebAssembly module; in our example, it's WASM.

Our next step is to define the entry point of the Workers script.

import * as rts from "./rts.mjs";
import fact from "./fact.req.mjs";

async function handleFact(param) {
  const i = await rts.newAsteriusInstance(Object.assign(fact, { module: WASM }));
  return await i.exports.fact(param);
}

async function handleRequest(req) {
  if (req.method == "POST") {
    const data = await req.formData();
    const param = parseInt(data.get("param"));
    if (param) {
      const resp = await handleFact(param);
      return new Response(resp, {status: 200});
    } else {
      return new Response(
        "Expecting 'param' in request to be an integer",
        {status: 400},
      );
    }
  }
  return new Response("Method not allowed", {status: 405});
}

addEventListener("fetch", event => {
  event.respondWith(handleRequest(event.request))
})

There are a few interesting things to point out in this code:

  1. We import rts.mjs and fact.req.mjs to load the exported functions from our WebAssembly module.
  2. handleFact is an asynchronous function that creates an Asterius instance, using the WASM Workers global variable as its module, and calls the exported function fact with some input.
  3. handleRequest handles the request of the Worker. It expects a POST request, with a parameter called param in the request body. If param is a number, it calls handleFact to respond with the result of fact.
  4. Using the Service Workers API, we listen to the fetch event that will respond with the result of handleRequest.

We need to build and bundle our code into a single JavaScript file, because Workers only accepts one script per Worker. Fortunately, Asterius comes with Parcel.js, which will bundle all the necessary code into a single JavaScript file.

ahc-link \
  --input-hs fact.hs \
  --no-main \
  --export-function=fact \
  --input-mjs fact_cfw.mjs \
  --bundle \
  --browser \
  --output-dir worker

ahc-link will generate some files inside a directory called worker. For our Workers, we’re only interested in the JavaScript file (fact.js) and the WebAssembly module (fact.wasm). Now, it’s time to submit both of them to Workers. We can do this with the provided REST API.

Make sure you have an account id ($CF_ACCOUNT_ID), a name for your script ($SCRIPT_NAME), and an API Token ($CF_API_TOKEN):

cd worker

curl -X PUT "https://api.cloudflare.com/client/v4/accounts/$CF_ACCOUNT_ID/workers/scripts/$SCRIPT_NAME" \
  -H "Authorization: Bearer $CF_API_TOKEN" \
  -F "metadata=@metadata.json;type=application/json" \
  -F "script=@fact.js;type=application/javascript" \
  -F "wasm=@fact.wasm;type=application/wasm"

Now, visit the Workers UI, where you can use the editor to view, edit, and test the script. You can also enable it on a workers.dev subdomain ($CFW_SUBDOMAIN); in that case, you could then simply:

curl -X POST $CFW_SUBDOMAIN \
  -H 'Content-Type: application/x-www-form-urlencoded' \
  --data 'param=5'

Beyond a simple Haskell file

So far, we've created a WebAssembly module that exports a pure Haskell function, and we ran it in Workers. However, we can also create and build a Cabal project using the Asterius ahc-cabal CLI, and then use ahc-dist to compile it to WebAssembly.

First, let’s create the project:

ahc-cabal init -m -p cabal-cfw-example

Then, let’s add some dependencies to our cabal project. The cabal file will look like this:

cabal-version:      2.4
name:               cabal-cfw-example
version:            0.1.0.0
license:            NONE

executable cabal-cfw-example
  ghc-options:      -optl--export-function=handleReq
  main-is:          Main.hs
  build-depends:    base,
                    bytestring,
                    aeson >=1.5 && < 1.6,
                    text
  default-language: Haskell2010

It's a simple cabal file, except for the -optl--export-function=handleReq GHC flag. This is necessary when exporting a function from a Cabal project.

In this example, we'll define a simple User record, and we'll derive its JSON instances automatically using Template Haskell!

{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE TemplateHaskell   #-}

module Main where

import           Asterius.Types
import           Control.Monad
import           Data.Aeson                 hiding (Object)
import qualified Data.Aeson                 as A
import           Data.Aeson.TH
import qualified Data.ByteString.Lazy.Char8 as B8
import           Data.Text

main :: IO ()
main = putStrLn "CFW Cabal"

data User = User
  { name :: Text
  , age  :: Int
  }

$(deriveJSON defaultOptions 'User)

NOTE: It's not necessary to create a Cabal project for this example, because the prebuilt container comes with a lot of prebuilt packages (aeson included). Nevertheless, it helps us show the potential of ahc-cabal and ahc-dist.

Next, we'll define handleReq, which we'll export using the JavaScript FFI just like we did before.

handleReq :: JSString -> JSString -> IO JSObject
handleReq method rawBody =
  case fromJSString method of
    "POST" ->
      let eitherUser :: Either String User
          eitherUser = eitherDecode (B8.pack $ fromJSString rawBody)
       in case eitherUser of
            Right _  -> js_new_response (toJSString "Success!") 200
            Left err -> js_new_response (toJSString err) 400
    _ -> js_new_response (toJSString "Not a valid method") 405

foreign export javascript "handleReq" handleReq :: JSString -> JSString -> IO JSObject

foreign import javascript "new Response($1, {\"status\": $2})"
  js_new_response :: JSString -> Int -> IO JSObject

This time, we also define js_new_response, a Haskell function that creates a JavaScript Response object. handleReq takes two string parameters from JavaScript and uses them to prepare a response.

Now let’s build and install the binary in the current directory:

ahc-cabal new-install --installdir . --overwrite-policy=always

This will generate a binary for our executable, called cabal-cfw-example. We’re going to use ahc-dist to take that binary and target WebAssembly:

ahc-dist \
  --input-exe cabal-cfw-example \
  --export-function=handleReq \
  --no-main \
  --input-mjs cabal_cfw_example.mjs \
  --bundle \
  --browser

cabal_cfw_example.mjs contains the following code:

import * as rts from "./rts.mjs";
import cabal_cfw_example from "./cabal_cfw_example.req.mjs";

async function handleRequest(req) {
  const i = await rts.newAsteriusInstance(Object.assign(cabal_cfw_example, { module: WASM }));
  const body = await req.text();
  return await i.exports.handleReq(req.method, body);
}

addEventListener("fetch", event => {
  event.respondWith(handleRequest(event.request))
});

Finally, we can deploy our code to Workers by defining a metadata.json file and uploading the script and the WebAssembly module using the Workers API, as we did before.

Caveats

Workers limits the file size of your JavaScript and WebAssembly. Therefore, you need to be careful with any dependencies you add.

Conclusion

Stack Builders builds better software for better living through technologies like expressive static types. We used Asterius to compile Haskell to WebAssembly and deployed it to Cloudflare Workers using the Workers API. Asterius supports a lot of Haskell features (e.g. Template Haskell) and it provides an easy-to-use JavaScript FFI to interact with JavaScript. Additionally, it provides prebuilt containers that contain a lot of Haskell packages, so you can start writing a script right away.

Following this approach, we can write functional type-safe code in Haskell, target it to WebAssembly, and publish it to Workers, which runs on the edge of the Cloudflare infrastructure.

For more content check our blogs and tutorials!

Categories: Technology

Know When You’ve Been DDoS’d

Mon, 05/10/2020 - 12:00
Know When You’ve Been DDoS’d

Today we're announcing the availability of DDoS attack alerts. The alerts are available free of charge to all Cloudflare customers on paid plans.

Unmetered DDoS protection

Last week we celebrated Cloudflare’s 10th birthday in what we call Birthday Week. Every year, on each day of Birthday Week, we announce a new product with the goal of helping make the Internet a better place -- one that is safer and faster. To do that, over the years we’ve democratized many products that were previously only available to large enterprises by making them available for free (or at very low cost) to all. For example, on Cloudflare’s 7th birthday in 2017, we announced free unmetered DDoS protection as part of every Cloudflare product and every plan, including the free plan.

DDoS attacks aim to take down websites or online services and make them unavailable to the public. We wanted to make sure that every organization and every website is available and accessible, regardless of whether they can afford enterprise-grade DDoS protection. This has been a core part of our mission. We've been heavily investing in our DDoS protection capabilities over the last 10 years, and we will continue to do so in the future.

Real-time DDoS attack alerts

I've recently published a few blogs that provide a look under the hood of our DDoS protection systems. These systems run autonomously; they detect and mitigate attacks without any human intervention, as was the case with the 654 Gbps attack in July and the 754 Mpps attack in June. We've been successful at blocking DDoS attacks and also providing our users with important analytics and insights about the attacks, but our customers also want to be notified in real time when they are targeted by DDoS attacks.

So today, we’re excited to announce the availability of DDoS alerts. The current delivery methods by Cloudflare plan type are listed in the table below. Additional delivery methods will be made available in the future.

Delivery methods by plan

Delivery method    Free    Pro    Business    Enterprise
Email              ❌      ✅     ✅          ✅
PagerDuty          ❌      ❌     ✅          ✅

There are two types of DDoS alerts: HTTP DDoS alerts and L3/4 DDoS alerts. Whether you are eligible for one or both depends on the Cloudflare services you are subscribed to. The table below lists the alert types by Cloudflare service.

Alert types by service

Alert type          WAF/CDN        Spectrum       Spectrum BYOIP    Magic Transit
HTTP DDoS alerts    ✅             ❌             ❌                ❌
L3/4 DDoS alerts    Coming soon    Coming soon    ✅                ✅

Creating a DDoS alert policy

In order to receive alerts on DDoS attacks that target your Cloudflare-protected Internet property, you must first create a notification policy. That’s fast and easy:

  1. Log in to your Cloudflare account dashboard: https://dash.cloudflare.com
  2. In the Account Home page, navigate to the Notifications tab
  3. In the Notifications card, click Create
  4. Give your notification a name, add an optional description, and add the email addresses of the recipients.

If you are on the Business plan or higher, you’ll need to connect to PagerDuty before creating the alert policy. Once you’ve done so, you’ll have the option to send the alert to your PagerDuty service.

Receive the alert, view the attack, and give feedback

When developing and designing the alert template, we interviewed many of our customers to understand what information is important to them and what would make the alert useful and easy to understand. We've intentionally kept the alert short. The email subject is also straightforward: DDoS Attack Detected. It will only be sent from our official email address, noreply@notify.cloudflare.com. Add this address to your list of trusted email addresses to ensure you don't miss the alerts.

The alert includes the following information:

  1. A short description of what happened
  2. The date and time the attack was initially detected and mitigated by our systems
  3. The attack type
  4. The max rate of the attack when the alert was triggered
  5. The attack target

The attack may still be ongoing when you receive the alert, so we also include a link to view the attack in the Cloudflare dashboard and a link to provide feedback on the protection and visibility.

We’d love to get your feedback!

We'd love your feedback on our DDoS protection solution. When you receive a DDoS alert, you'll be provided with a link to submit your feedback. Your feedback helps us measure user satisfaction for Cloudflare's DDoS protection and the attack analytics we provide in the dashboard. User satisfaction is one of the main Key Performance Indicators (KPIs) we monitor closely for our DDoS protection service. So give your feedback, and help us make DDoS protection better for everyone.

Not a Cloudflare customer yet? Sign up to get started.

Categories: Technology

Birthday week: Cloudflare turns 10

Sun, 04/10/2020 - 13:03
Cloudflare turns 10

2020 marks a major milestone for Cloudflare: it’s our 10th birthday.

We’ve always used birthdays as an opportunity to give back to the Internet. But this year — a year in which the Internet has been so central to giving us all some degree of connectedness and normalcy — it feels like giving back to the Internet has been more important than ever.

And while we couldn’t celebrate in person, we were humbled by some of the incredible minds that joined us online to talk about how the Internet has changed over the last ten years — and what we might see over the next ten.

With that, let’s recap the key announcements from Birthday Week 2020.

Day 1, Monday: Workers

During Birthday Week in 2017, Cloudflare announced Workers — a serverless platform that represented a completely new way to build applications: by writing your code directly onto our network edge. On Monday of this year’s Birthday Week, we announced Durable Objects and Cron Triggers — both of which continue to expand the use cases that Workers can address.

Many folks associate the serverless paradigm with functions as a service — which, at its core, is stateless. Workers KV started down the path of changing this, providing high availability storage on the edge. However, there are use cases where consistency (a client making a request to a database will get the same view of the data) is more important than availability (a client making a request to a database always receives a response). Say you want to sell tickets to a concert — you don’t want to allow two people to be able to purchase the same ticket. With a traditional application, with a database running in one location, that’s relatively easy to ensure. But with Workers running in Cloudflare’s data centers all over the world, ensuring consistency is a little bit more challenging. Workers Durable Objects solves this for developers: giving them access to strongly consistent storage when they’re building on the Workers platform.
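
To make the concert tickets example concrete, here is a minimal sketch of what a Durable Object guarding a single ticket might look like (the class name and logic are ours, for illustration; this is not Cloudflare's example code):

export class Ticket {
  constructor(state) {
    this.state = state;
  }

  async fetch(request) {
    // Each ticket lives in exactly one object, so there is a single point
    // of coordination for its state: two buyers cannot both see "unsold".
    const sold = await this.state.storage.get("sold");
    if (sold) {
      return new Response("Ticket already sold", { status: 409 });
    }
    await this.state.storage.put("sold", true);
    return new Response("Ticket purchased!", { status: 200 });
  }
}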

Similarly, triggering Workers has historically required a user to do something: a user visiting a URL, for example. But developers have use cases where they want a Worker to run independent of a user doing something right now. Syncing, for example. Batch jobs. Or perhaps doing something 24 hours after a user has done something. This is where Cron Triggers come in — now, for developers on the Workers platform, there's no more need to rely on an eyeball to get things rolling.
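
As a rough sketch (the job and URL below are hypothetical), a Cron-Triggered Worker listens for the scheduled event instead of fetch, so no eyeball is involved:

// Runs on the cron schedule configured for the Worker; no user request needed.
addEventListener("scheduled", event => {
  event.waitUntil(runBatchJob(event));
});

async function runBatchJob(event) {
  // Hypothetical batch job: sync state from an upstream API.
  const response = await fetch("https://example.com/api/sync");
  console.log(`Job ran at ${new Date(event.scheduledTime)}: ${response.status}`);
}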

Day 2, Tuesday: Analytics

There are a lot of website analytics products out there on the market. Many of those products are, not surprisingly, very good.

But the way they’ve been implemented often leaves a lot to be desired. Most of them operate by tracking individual users, using client-side state like cookies or localStorage — or even fingerprints. This is increasingly a problem. There’s the principle of it: we don’t want to be tracked individually — why would we want visitors to our web properties to feel tracked either? Beyond that though, because so many people are feeling uncomfortable with how they’re being tracked around the web, they’re simply blocking a lot of these analytics products. As a result, all these analytics products are increasingly becoming less accurate.

On Tuesday, we announced a new Web Analytics product that allows you to get the best of both worlds — detailed and accurate analytics, without compromising on the privacy of your users. We don’t use any client-side state, like cookies or localStorage, for the purposes of tracking users. And we don’t “fingerprint” individuals via their IP address, User Agent string, or any other data for the purpose of displaying analytics (we consider fingerprinting even more intrusive than cookies, because users have no way to opt out). Because Cloudflare's business has never been built around tracking users or selling advertising, we don’t do it. Just the metrics, ma’am.

That wasn’t all on Tuesday, though. Another crucial aspect of owning a web property is website performance. Not only does it impact user experience, Google uses a blended measure of performance to inform site ranking in their search results. Google’s Chrome team has been doing some great work on metricizing site performance, and that’s culminated in Web Vitals. We’ve worked with the Chrome team to integrate Web Vitals in our Browser Insights product. You’ve always gotten edge-side performance analytics from Cloudflare, but now, you’re not just seeing the server side view of your web performance: it’s blended with how your users perceive performance, too. We take all that data and present it in a pragmatic way to help you figure out what you need to do to optimize the performance of your site.
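
For a sense of what these signals look like, here is a sketch that collects a few Web Vitals in the browser using the Chrome team's open-source web-vitals library; this is purely illustrative and is not how Browser Insights is implemented (the /collect endpoint is a placeholder):

import {getCLS, getFID, getLCP} from "web-vitals";

// Report each finished metric to a hypothetical collection endpoint.
function report(metric) {
  navigator.sendBeacon("/collect", JSON.stringify({
    name: metric.name,   // "CLS", "FID", or "LCP"
    value: metric.value,
  }));
}

getCLS(report);
getFID(report);
getLCP(report);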

Day 3, Wednesday: Cloudflare Radar and Speeding up HTTPS/HTTP3

As of today, Cloudflare sits in front of 14.5% of the world’s top 10 million websites. The privilege of getting to serve so many different customers means we get visibility into a lot of things on the web. Wednesday of birthday week was about us taking advantage of that for everyone who is out on the web today.

If you think about the traffic flowing through a city at any given time, it's like a living, breathing creature. It ebbs and flows; it has rhythms that follow the sun and moon. Unusual events can cause traffic jams, as can accidents. Many cities have traffic reporting services for exactly this reason; knowing what's going on can immensely help those who need to navigate the city streets. The web is a global version of this, and given the role the Internet now plays for humanity, understanding what's going on is arguably as important as all those city traffic reports around the world.

And yet, when you want to get the equivalent of that traffic report, where do you go?

Cloudflare Radar is our answer to that question. Each second, Cloudflare handles on average 18 million HTTP requests and 6 million DNS requests. We block 72 billion cyberthreats every day. Add to that the 1 billion unique IP addresses connecting to Cloudflare's network, and we have one of the most representative views of Internet traffic worldwide. Before Radar, all this activity, good and bad, was only available internally at Cloudflare: we used it to help improve our service and protect our customers. With the release of Radar, however, we're exposing it externally: shining a light on the Internet's patterns for the world to see.

On the subject of spotting interesting patterns: back in late June, our team noticed a weird spike in DNS requests for the 65479 Resource Record. It turned out these spikes were part of Apple's iOS 14 beta release — Apple was testing out a new SVCB/HTTPS record type. The aim: to patch a limitation inherent in the HTTPS and HTTP/3 protocols. When a user types in a URL without specifying the protocol, the initial negotiation happens in plaintext because browsers start with HTTP. Only once it's established that an HTTPS or HTTP/3 resource exists will the browser transition over. The problem here is twofold: latency, and also security.

But you know what happens before any HTTP negotiation can happen? A DNS request. And that's what Apple's implementation was doing to create this interesting pattern: the DNS request was effectively asking whether the site supported HTTPS or HTTP/3. As of Wednesday of Birthday Week, Cloudflare's DNS servers will automatically generate HTTPS records on the fly to advertise whether a particular zone supports HTTP/3 and/or HTTP/2, based on whether those features are enabled on the zone. The result: better performance and improved security. Who says you need to pick just one?
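
You can watch these records in action yourself. Here is a sketch that queries the HTTPS record (DNS type 65) for a zone through Cloudflare's JSON DNS-over-HTTPS API; the zone name is just an example:

// Query the HTTPS record (type 65) for a zone via the 1.1.1.1 JSON DoH API.
async function queryHTTPSRecord(name) {
  const response = await fetch(
    `https://cloudflare-dns.com/dns-query?name=${name}&type=65`,
    { headers: { accept: "application/dns-json" } }
  );
  const data = await response.json();
  return data.Answer; // records advertising supported protocols (e.g. alpn)
}

queryHTTPSRecord("blog.cloudflare.com").then(console.log);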

Day 4, Thursday: API day

Nobody has ever doubted the importance of user interfaces. Finding ways for humans and computers to engage each other has been an area of focus since the very first computers were invented. But as the web has grown, data has become the new oil, and applications have proliferated, there’s another interface that has grown in importance: the interface between different types of applications. Day 4 of Birthday Week was all about APIs.

The first announcement was beta support for gRPC: a newer type of protocol intended for building APIs at scale. Most REST APIs use HTTPS and JSON to communicate values. The problem is that these are really designed for the other type of interface mentioned above: humans talking to computers. The upside is that they're human readable; the downside is that they're really inefficient, and as the use of APIs continues to explode, this inefficiency proliferates. The gRPC protocol is an answer to this: it's an efficient protocol for computers to talk to each other. But up until now, that also came at a price: because gRPC uses newer technology (like HTTP/2) under the covers, existing security and performance tools did not support gRPC traffic out of the box. This meant that customers adopting gRPC to power their APIs had to pick between modernity on one hand, and things like security, performance, and reliability on the other.

Cloudflare's announcement of support for gRPC fixes this trade-off: when you put your gRPC APIs on Cloudflare, you get all the traditional benefits of Cloudflare along with them. Apprehensive about exposing your APIs to bad actors? Need more performance? Turn on Argo Smart Routing to decrease time to first byte. Increase reliability by adding a Load Balancer. Or add security features such as Bot Management and the WAF.

Speaking of the WAF: it secures web applications from attacks by looking for attack patterns — say, bot patterns that try to imitate human behavior, or abuse of how a browser interacts with a site; in both instances, the attack is intended to break something. But because the way computers talk to each other is different from the way computers talk to humans, the attack vectors are different too. Protecting APIs therefore isn't quite the same as protecting websites.

API Shield is purpose-built for just this. It makes it simple to secure APIs through strong client certificate-based authentication and strict schema-based validation. On the authentication side, API Shield uses mutual TLS, which is not vulnerable to the reuse or sharing of passwords or tokens. And once developers can be sure that only legitimate clients (with SSL certificates in hand) are connecting to their APIs, the next step in API Shield is making sure those clients are making valid requests. It works by matching the contents of API requests — the query parameters that come after the URL and the contents of the POST body — against a contract or "schema" that contains the rules for what is expected. If validation fails, the API call is blocked, protecting the origin from an invalid request or a malicious payload.
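
To illustrate the idea of schema validation in general (this toy sketch is ours; it is not API Shield's engine or its configuration format):

// A toy contract: every query parameter must appear in the schema and match
// its rule; anything else is rejected before it can reach the origin.
const schema = {
  id: /^\d+$/,            // hypothetical rule: numeric identifiers only
  format: /^(json|xml)$/, // hypothetical rule: two allowed output formats
};

function validateRequest(url) {
  for (const [key, value] of new URL(url).searchParams) {
    const rule = schema[key];
    if (!rule || !rule.test(value)) return false;
  }
  return true;
}

console.log(validateRequest("https://api.example.com/items?id=42&format=json")); // true
console.log(validateRequest("https://api.example.com/items?id=oops"));           // false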

And, as you’d expect from Cloudflare, gRPC and API Shield support each other out of the box.

Day 5, Friday: Automatic Platform Optimization (starting with WordPress)

The idea of caching static assets is not new, and it's something Cloudflare has supported from its inception. It works wonders in speeding up websites: if your origin is slow and/or your user is far from the origin server, all your performance metrics suffer. Caching also has the added benefit of reducing load on origin servers.

However, things get a little more tricky when it comes to dynamic assets: if the asset could change, shouldn't you go back to the origin just to make sure? For this reason, by default, Cloudflare doesn't cache HTML content: there's a chance it's going to change for each user. The reality, though, is that most HTML isn't really dynamic. It needs to be able to change relatively quickly when the site is updated, but for a huge portion of the web, the content is static for months or years at a time. There are special cases, like when a user is logged in (as the admin or otherwise), where the content needs to differ, but the vast majority of visits are from anonymous users.

Automatic Platform Optimization, which was announced on Friday, brings more intelligence to this — allowing us to figure out when we should be caching HTML, and when we shouldn’t. The advantage of this is it moves more content closer to the user, and it does it automagically — there’s no configuration required. The benefits aren’t trivial: a 72% reduction in Time to First Byte (TTFB), 23% reduction to First Contentful Paint, and 13% reduction in Speed Index for desktop users at the 90th percentile. We’re starting off with support for WordPress — 38% of all websites, but the plan is to expand this to other platforms in the near future.

All day, every day: Cloudflare TV

Ten years is a long time. The milestone for Cloudflare seemed to be the perfect opportunity to look back over the last ten years of the Internet — what’s changed, what’s surprised us? And more than that: what’s coming over the next ten years?

To look back and then peer out into the future, we were humbled to be joined by some of the most celebrated names in tech and beyond. Among the highlights: Apple co-founder Steve Wozniak, Zoom CEO Eric Yuan, OpenTable CEO Debby Soo, Stripe co-founder and President John Collison, Former CEO & Executive Chairman of Google and Co-Founder of Schmidt Futures Eric Schmidt, former McAfee CEO Chris Young, former Seal Team 6 Commander Dave Cooper, Project Include CEO Ellen Pao, and so many more. All told, it was  24 hours of live discussions over the course of the week.

And with that, it’s a wrap! To everyone who has been a part of the Cloudflare journey over the past 10 years: our customers, folks on the team, friends and supporters, and our partners all around the world: thank you. It’s been an incredible ride.

And, as our co-founder Michelle likes to say, we’re just getting started.

Categories: Technology

Introducing support for the AVIF image format

Sat, 03/10/2020 - 14:00
Introducing support for the AVIF image format

We've added support for the new AVIF image format in Image Resizing. It compresses images significantly better than older-generation formats such as WebP and JPEG. It's supported in Chrome desktop today, and support is coming to other Chromium-based browsers, as well as Firefox.

What’s the benefit?

More than half of an average website's bandwidth is spent on images. Improved image compression can save bandwidth and improve the overall performance of the web. The compression in AVIF is so good that images can shrink to half the size of JPEG and WebP.

What is AVIF?

AVIF is a combination of the HEIF ISO standard and the royalty-free AV1 codec developed by Mozilla, Xiph, Google, Cisco, and many others.

Currently JPEG is the most popular image format on the Web. It's doing remarkably well for its age, and it will likely remain popular for years to come thanks to its excellent compatibility. There have been many previous attempts at replacing JPEG, such as JPEG 2000, JPEG XR and WebP. However, these formats offered only modest compression improvements, and didn't always beat JPEG on image quality. Compression and image quality in AVIF is better than in all of them, and by a wide margin.

[Image comparison at equal file sizes: JPEG (17KB), WebP (17KB), AVIF (17KB)]

Why a new image format?

One of the big things AVIF does is fix WebP's biggest flaws. WebP is over 10 years old, and AVIF is a major upgrade over it. The two formats are technically related: they're both based on a video codec from the VPx family. WebP uses the old VP8 version, while AVIF is based on AV1, the next generation after VP10.

WebP is limited to 8-bit color depth, and in its best-compressing mode of operation, it can only store color at half of the image's resolution (known as chroma subsampling). This causes edges of saturated colors to be smudged or pixelated in WebP. AVIF supports 10- and 12-bit color at full resolution, and high dynamic range (HDR).

[Image comparison of color edge rendering: JPEG, WebP, WebP (sharp YUV option), AVIF]

AV1 also uses a new compression technique: chroma-from-luma. Most image formats store brightness separately from color hue. AVIF uses the brightness channel to guess what the color channel may look like. They are usually correlated, so a good guess gives smaller file sizes and sharper edges.

Adoption of AV1 and AVIF

VP8 and WebP suffered from reluctant industry adoption. Safari only added WebP support very recently, 10 years after Chrome.

Chrome 85 supports AVIF already. It’s a work in progress in other Chromium-based browsers, and Firefox. Apple hasn't announced whether Safari will support AVIF. However, this time Apple is one of the companies in the Alliance for Open Media, creators of AVIF. The AV1 codec is already seeing faster adoption than the previous royalty-free codecs. Latest GPUs from Nvidia, AMD, and Intel already have hardware decoding support for AV1.

AVIF uses the same HEIF container as the HEIC format used in iOS’s camera. AVIF and HEIC offer similar compression performance. The difference is that HEIC is based on the commercial, patent-encumbered H.265 format. In countries that allow software patents, H.265 is illegal to use without obtaining patent licenses. It's covered by thousands of patents, owned by hundreds of companies, which have fragmented into two competing licensing organizations. Costs and complexity of patent licensing used to be acceptable when videos were published by big studios, and the cost could be passed on to the customer in the price of physical discs and hardware players. Nowadays, when video is mostly consumed via free browsers and apps, the old licensing model has become unsustainable. This has created a huge incentive for Web giants like Google, Netflix, and Amazon to get behind the royalty-free alternative. AV1 is free to use by anyone, and the alliance of tech giants behind it will defend it from patent trolls' lawsuits.

Non-standard image formats usually can only be created using their vendor's own implementation. AVIF and AV1 are already ahead with multiple independent implementations: libaom, Intel SVT-AV1, libgav1, dav1d, and rav1e. This is a sign of a healthy adoption and a prerequisite to be a Web standard. Our Image Resizing is implemented in Rust, so we've chosen the rav1e encoder to create a pure-Rust implementation of AVIF.

Caveats

AVIF is a feature-rich format. Most of its features are for smartphone cameras, such as "live" photos, depth maps, bursts, HDR, and lossless editing. Browsers will support only a fraction of these features. In our implementation in Image Resizing we’re supporting only still standard-range images.

The two biggest problems with AVIF are encoding speed and the lack of progressive rendering.

AVIF images don't show anything on screen until they're fully downloaded. In contrast, a progressive JPEG can display a lower-quality approximation of the image very quickly, while it's still loading. When progressive JPEGs are delivered well, they make websites appear to load much faster. Progressive JPEG can look loaded at half of its file size. AVIF can fully load at half of JPEG's size, so it somewhat overcomes the lack of progressive rendering with the sheer compression advantage. In this case only WebP is left behind, which has neither progressive rendering nor strong compression.

Decoding AVIF images for display takes relatively more CPU power than other codecs, but it should be fast enough in practice. Even low-end Android devices can play AV1 videos in full HD without help of hardware acceleration, and AVIF images are just a single frame of an AV1 video. However, encoding of AVIF images is much slower. It may take even a few seconds to create a single image. AVIF supports tiling, which accelerates encoding on multi-core CPUs. We have lots of CPU cores, so we take advantage of this to make encoding faster. Image Resizing doesn’t use the maximum compression level possible in AVIF to further increase compression speed. Resized images are cached, so the encoding speed is noticeable only on a cache miss.

We will be adding AVIF support to Polish as well. Polish converts images asynchronously in the background, which completely hides the encoding latency, and it will be able to compress AVIF images better than Image Resizing.

Note about benchmarking

It's surprisingly difficult to fairly and accurately judge which lossy codec is better. Lossy codecs are specifically designed to mainly compress image details that are too subtle for the naked eye to see, so "looks almost the same, but the file size is smaller!" is a common feature of all lossy codecs, and not specific enough to draw conclusions about their performance.

Lossy codecs can't be compared by comparing just file sizes. Lossy codecs can easily make files of any size. Their effectiveness is in the ratio between file size and visual quality they can achieve.

The best way to compare codecs is to make each compress an image to the exact same file size, and then to compare the actual visual quality they've achieved. If the file sizes differ, any judgement may be unfair, because the codec that generated the larger file may have done so only because it was asked to preserve more details, not because it can't compress better.

How and when to enable AVIF today?

We recommend AVIF for websites that need to deliver high-quality images with as little bandwidth as possible. This is important for users of slow networks and in countries where the bandwidth is expensive.

Browsers that support the AVIF format announce it by adding image/avif to their Accept request header. To request the AVIF format from Image Resizing, set the format option to avif.

The best method to detect and enable support for AVIF is to use image resizing in Workers:

addEventListener('fetch', event => {
  const imageURL = "https://jpeg.speedcf.com/cat/4.jpg";

  const resizingOptions = {
    // You can set any options here, see:
    // https://developers.cloudflare.com/images/worker
    width: 800,
    sharpen: 1.0,
  };

  const accept = event.request.headers.get("accept");
  const isAVIFSupported = /image\/avif/.test(accept);
  if (isAVIFSupported) {
    resizingOptions.format = "avif";
  }

  // The resizing options are passed to fetch() via the `cf` object.
  event.respondWith(fetch(imageURL, {cf: {image: resizingOptions}}));
})

The above script will auto-detect the supported format, and serve AVIF automatically. Alternatively, you can use URL-based resizing together with the <picture> element:

<picture>
  <source type="image/avif" srcset="/cdn-cgi/image/format=avif/image.jpg">
  <img src="/image.jpg">
</picture>

This uses our /cdn-cgi/image/… endpoint to perform the conversion, and the alternative source will be picked only by browsers that support the AVIF format.

We have the format=auto option, but it won't choose AVIF yet. We're cautious, and we'd like to test the new format more before enabling it by default.

Categories: Technology

Introducing Automatic Platform Optimization, starting with WordPress

Fri, 02/10/2020 - 14:01
Introducing Automatic Platform Optimization, starting with WordPress

Today, we are announcing Automatic Platform Optimization (APO), a new service that serves more than just the static content of your website. With this launch, we are supporting WordPress, the most popular website hosting solution, serving 38% of all websites. Our testing, as detailed below, showed a 72% reduction in Time to First Byte (TTFB), a 23% reduction in First Contentful Paint, and a 13% reduction in Speed Index for desktop users at the 90th percentile, by serving nearly all of your website's content from Cloudflare's network. This means visitors to your website see not only the first content sooner but all content more quickly.

With Automatic Platform Optimization for WordPress, your customers won’t suffer any slowness caused by common issues like shared hosting congestion, slow database lookups, or misbehaving plugins. This service is now available for anyone using WordPress. It costs $5/month for customers on our Free plan and is included, at no additional cost, in our Professional, Business, and Enterprise plans. No usage fees, no surprises, just speed.

How to get started

The easiest way to get started with APO is from your WordPress admin console.

1. First, install the Cloudflare WordPress plugin on your WordPress website or update to the latest version (3.8.2 or higher).
2. Authenticate the plugin (steps here) to talk to Cloudflare if you have not already done that.
3. From the Home screen of the Cloudflare section, turn on Automatic Platform Optimization.

Free customers will first be directed to the Cloudflare Dashboard to purchase the service.

Why We Built This

At Cloudflare, we jump at the opportunity to make hard problems for our customers disappear with the click of a button. Running a consistently fast website is challenging. Many businesses don't have the time or money to spend on complicated and expensive performance solutions for their site. Even if they do, it can be extremely costly to pay for specialized attention to ensure you get the best performance possible. Having a fast website doesn't have to be complicated, though. The closer your content is to your customers, the better your site will perform. Static content caching does this for files like images, CSS, and JavaScript, but that is only part of the equation. Dynamic content is still fetched from the origin, incurring costly round trips and additional processing time. For more info on dynamic versus static content, see our learning center.

WordPress is one of the most open platforms in the world, but that means you are always at risk of incurring performance penalties because of plugins or other sources that, while necessary, may be hard to pinpoint and resolve. With the Automatic Platform Optimization service, we put your website into our network that is within 10 milliseconds of 99% of the Internet-connected population in the developed world, all without having to change your existing hosting provider. This means that for most requests your customers won’t even need to go to your origin, reducing many costly round trips and server processing time. These optimizations run on our edge network, so they also will not impact render or interactivity since no additional JavaScript is run on the client.

How We Measure Web Performance

Evaluating performance of a website is difficult. There are many different metrics you can track and it is not always obvious which metrics most meaningfully represent a user’s experience. As discussed when we blogged about our new Speed page, we aim to simplify this for customers by automating tests using the infrastructure of webpagetest.org, and summarizing both the results visually and numerically in one place.


The visualization gives you a clear idea of what customers are going to see when they come to your site, and the Critical Loading Times provide the most important metrics to judge your performance. On top of seeing your site’s performance, we provide a list of recommendations for ways to even further increase your performance. If you are using WordPress, then we will test your site with Automatic Platform Optimizations to estimate the benefit you will get with the service.

The Benefits of Automatic Platform Optimization

We tested APO on over 500 Cloudflare customer websites that run on WordPress to understand what the performance improvements would be. The results speak for themselves:

Test Results

Metric                          Percentile    Baseline    Cloudflare APO Enabled    Improvement (%)
Time to First Byte (TTFB)       90th          1252 ms     351 ms                    71.96%
                                10th          254 ms      261 ms                    -2.76%
First Contentful Paint (FCP)    90th          2655 ms     2056 ms                   22.55%
                                10th          894 ms      783 ms                    12.46%
Speed Index (SI)                90th          6428        5586                      13.11%
                                10th          1301        1242                      4.52%

Note: Results are based on test results of 505 randomly selected websites that are cached by Cloudflare. Tests were run using WebPageTest from South Carolina, USA and the following environment: Chrome, Cable connection speed.

Most importantly, with APO, a site's TTFB is made both fast and consistent. Because we now serve the HTML from Cloudflare's edge with zero origin processing time, getting the first byte to the eyeball is consistently fast. Under heavy load, a WordPress origin can suffer delays in building the HTML and returning it to visitors. APO removes the variance due to load, resulting in a consistent TTFB under 400 ms.

Additionally, between faster TTFB and additional caching of third party fonts, we see performance improvements in both FCP and SI for both the fastest and slowest of the sites we tested. Some of this comes naturally from reducing the TTFB, since every millisecond you shave off of TTFB is a potential millisecond gain for other metrics. Caching additional third party fonts allows us to reduce the time it takes to fetch that content. Given fonts can often block paints due to text rendering, this improves the rate at which the page paints, and improves the Speed Index.

We asked the folks at Kinsta to try out APO, given their expertise in WordPress, and tell us what they think. Brian Li, a Website Content Manager at Kinsta, ran a set of tests from around the world on a website hosted in Tokyo. I’ll let him explain what they did and the results:

At Kinsta, WordPress performance is something that’s near and dear to our hearts. So, when Cloudflare reached out about testing their new Automatic Platform Optimization (APO) service for WordPress, we were all ears.
 
This is what we did to test out the new service:

  1. We set up a test site in Tokyo, Japan – one of the 24 high-performance data center locations available for Kinsta customers.
  2. We ran several speed tests from six different locations around the world with and without Cloudflare’s APO.

 
The results were incredible!
 
By caching static HTML on Cloudflare’s edge network, we saw a 70-300% performance increase. As expected, the testing locations furthest away from Tokyo saw the biggest reduction in load time.
 
If your WordPress site uses a traditional CDN that only caches CSS, JS, and images, upgrading to Cloudflare’s WordPress APO is a no-brainer and will help you stay competitive with modern Jamstack and static sites that live on the edge by default.

Brian’s test results are summarized in this image:

[Image: Page load speeds for loading a website hosted in Tokyo from 6 locations worldwide, comparing Kinsta, Kinsta with KeyCDN, and Kinsta with Cloudflare APO.]

One of the clear benefits, from Kinsta’s testing of APO, is the consistency of performance for serving your site no matter where your visitors are in the world. The consistent sub-second performance shown with APO versus two or three second load times in other setups makes it clear that if you have a global customer base, APO delivers an improved experience for all visitors.

How Automatic Platform Optimization Works

Automatic Platform Optimization is the result of being able to use the power of Cloudflare Workers to intelligently cache dynamic content. By caching dynamic content, we can serve the entire website from our edge network. Think ‘static site’ but without any of the work of having to build or maintain a static site. Customers can keep managing and updating content on their website in the same way and leave the hard work for performance to us. Serving both static and dynamic content from our network results, generally, in no origin requests or origin processing time. This means all the communication occurs between the user’s device and our edge. Reducing the multitude of round trips typically required from our edge to the origin for dynamic content is what makes this service so effective. Let’s first see what it normally looks like to load a WordPress site for a visitor.

[Diagram: a sequence diagram for a typical user visiting a site]

In a regular request flow, Cloudflare is able to cache some of the content like images, CSS, or JS, while other requests go to either the origin or a third party service in order to fetch the content. Most importantly the first request to fetch the HTML for the site needs to go to the origin which is a typical cause of long TTFB, since no other requests get made until the client can receive the HTML and parse it to make subsequent requests.

[Diagram: the same site visit, but with APO enabled]

Once APO is enabled, all those trips to the origin are removed. TTFB benefits greatly because the first hop starts and ends at Cloudflare's network. This also means the browser starts working on fetching and painting the webpage sooner, so each paint event occurs earlier. Lastly, by caching third-party fonts, we remove additional requests that would need to leave Cloudflare's network and that extend the time to display text to the user. Often, websites use fonts hosted on third-party domains. While this saves bandwidth costs that would be incurred from hosting them on the origin, depending on where those fonts are hosted, it can still be a costly operation to fetch them. By rehosting the fonts and serving them from our cache, we can remove one of the remaining costly round trips.

With APO for WordPress, you can say bye bye to database congestion or unwieldy plugins slowing down your customers’ experience. These benefits are stacked on top of our already fast TLS connection times and industry leading protocol support like HTTP/2 that ensure we are using the most efficient and the fastest way to connect and deliver your website to your customers.

For customers with WordPress sites that support authenticated sessions, you do not have to worry about us caching content from authenticated users and serving it to others. We bypass the cache on standard WordPress and WooCommerce cookies for authenticated users. This ensures customized content for a specific user is only visible to that user. While this has been available to customers with our Business-level service, it is now available for any WordPress customer that enables APO.

You might be wondering: “This all sounds great, but what about when I change content on my site?” Because this service works in tandem with our WordPress plugin, we are able to understand when you make changes, quickly purge the content from Cloudflare's edge, and refresh it with the new content. With the plugin installed, we detect content changes and update our edge network worldwide with automatic cache purges. As part of this release, we have updated our WordPress plugin, so whether or not you use APO, you should upgrade to the latest version today. If you do not or cannot use our WordPress plugin, APO will still provide the same performance benefits, but it may serve stale content for up to 30 minutes until it is revalidated by a subsequent request.
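
For a sense of what the plugin automates, here is a sketch of a targeted purge using Cloudflare's cache purge API; the zone ID, token, and URL are placeholders:

// Purge a single updated URL from Cloudflare's cache (run in an async context).
await fetch(`https://api.cloudflare.com/client/v4/zones/${ZONE_ID}/purge_cache`, {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${API_TOKEN}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ files: ["https://example.com/updated-post/"] }),
});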

This service was built on the prototype work originally blogged about here and here. For a more in depth look at the technical side of the service and how Cloudflare Workers allowed us to build the Automatic Platform Optimization service, see the accompanying blog post.

WordPress Today, Other Platforms Coming Soon

While today’s announcement is focused on supporting WordPress, this is just the start. We plan to bring these same capabilities to other popular platforms used for web hosting. If you operate a platform and are interested in how we may be able to work together to improve things for all your customers, please get in touch. If you are running a website, let us know what platform you want to see us bring Automatic Platform Optimization to next.

Categories: Technology

Building Automatic Platform Optimization for WordPress using Cloudflare Workers

Fri, 02/10/2020 - 14:00
Building Automatic Platform Optimization for WordPress using Cloudflare Workers

This post explains how we implemented Automatic Platform Optimization for WordPress. In doing so, we have defined a new place to run WordPress plugins: at the edge, written with Cloudflare Workers. We provide the feature as a Cloudflare service, but what's exciting is that anyone could build this using the Workers platform.

The service is an evolution of the ideas explained in an earlier zero-config edge caching of HTML blog post. The post will explain how Automatic Platform Optimization combines the best qualities of the regular Cloudflare cache with Workers KV to improve cache cold starts globally.

The optimization will work both with and without the Cloudflare for WordPress plugin integration. Not only have we provided a zero config edge HTML caching solution but by using the Workers platform we were also able to improve the performance of Google font loading for all pages.

We are launching the feature first for WordPress specifically but the concept can be applied to any website and/or content management system (CMS).

A new place to run WordPress plugins?

There are many individual WordPress plugins for performance that use similar optimizations to existing Cloudflare services. Automatic Platform Optimization is bringing them all together into one easy to use solution, deployed at the edge.

Traditionally you have to maintain server plugins with your WordPress installation. This comes with maintenance costs and can require a deep understanding of how to fine-tune performance and security for each and every plugin. Providing the optimizations on the client side can also lead to performance problems due to the cost of JavaScript execution. In contrast, most of these optimizations can be built into Cloudflare's edge rather than running on the server or the client. Automatic Platform Optimization will always be up to date with the latest performance and security best practices.

How to optimize for WordPress

By default, the Cloudflare CDN caches assets based on file extension and doesn't cache HTML content. It is possible to configure HTML caching with a Cache Everything Page Rule, but it is a manual process and often requires additional features only available on the Business and Enterprise plans. So for the majority of WordPress websites, even with a CDN in front of them, HTML content is not cached. Requests for an HTML document have to go all the way to the origin.


Even if a CDN optimizes the connection between the closest edge and the website’s origin, the origin could be located far away and also be slow to respond, especially under load.

Move content closer to the user

One of the primary recommendations for speeding up websites is to move content closer to the end-user. This reduces the amount of time it takes for packets to travel between the end-user and the web server - the round-trip time (RTT). This improves the speed of establishing a connection as well as serving content from a closer location.


We have previously blogged about the benefits of edge caching HTML. Caching and serving HTML from the Cloudflare edge will greatly improve the time to first byte (TTFB) by optimizing DNS, connection setup, and SSL negotiation, and by removing the origin server response time. If your origin is slow in generating HTML and/or your user is far from the origin server, then all your performance metrics will be affected.

Most HTML isn’t really dynamic. It needs to be able to change relatively quickly when the site is updated but for a huge portion of the web, the content is static for months or years at a time. There are special cases like when a user is logged-in (as the admin or otherwise) where the content needs to differ but the vast majority of visits are of anonymous users.

Zero config edge caching revisited

The goal is to make updating content on the edge happen automatically. The edge will cache and serve the previous version of the content until new content is available. This is usually achieved by triggering a cache purge to remove existing content. In fact, using a combination of our WordPress plugin and the Cloudflare cache purge API, we already support Automatic Cache Purge on Website Updates. This feature has been in use for many years.

Building automatic HTML edge caching is more nuanced than caching traditional static content like images, styles or scripts. It requires defining rules on what to cache and when to update the content. To help with that task we introduced a custom header to communicate caching rules between Cloudflare edge and origin servers.

The Cloudflare Worker runs in every edge data center, and the serverless platform takes care of scaling to our needs. Based on the request type, it will return HTML content from the Cloudflare Cache using the Workers Cache API or serve a response directly from the origin. A specifically designed custom header provides information from the origin on how the script should handle the response. For example, the worker script will never cache responses for authenticated users.

HTML Caching rules

With or without Cloudflare for WordPress plugin, HTML edge caching requires all of the following conditions to be met:

  • Origin responds with 200 status
  • Origin responds with "text/html" content type
  • Request method is GET.
  • Request path doesn’t contain query strings
  • Request doesn’t contain any WordPress specific cookies: "wp-*", "wordpress*", "comment_*", "woocommerce_*" unless it’s "wordpress_eli" or "wordpress_test_cookie".
  • Request doesn’t contain any of the following headers:
    • "Cache-Control: no-cache"
    • "Cache-Control: private"
    • "Pragma:no-cache"
    • "Vary: *"

Note that the caching is bypassed if the devtools are open and the “Disable cache” option is active.
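
Since anyone could build this on the Workers platform, here is a minimal sketch, under our own assumptions rather than Cloudflare's production code, of how the conditions above might be checked:

// True if a WordPress-specific cookie should bypass the cache.
function hasBypassCookie(cookieHeader) {
  const exceptions = new Set(["wordpress_eli", "wordpress_test_cookie"]);
  return cookieHeader.split(";").some(part => {
    const name = part.split("=")[0].trim().toLowerCase();
    if (exceptions.has(name)) return false;
    return /^(wp-|wordpress|comment_|woocommerce_)/.test(name);
  });
}

// Apply the HTML edge caching conditions to a request/response pair.
function isHTMLCacheable(request, response) {
  if (request.method !== "GET") return false;
  if (new URL(request.url).search !== "") return false;
  if (hasBypassCookie(request.headers.get("cookie") || "")) return false;

  const cacheControl = request.headers.get("cache-control") || "";
  if (/no-cache|private/.test(cacheControl)) return false;
  if ((request.headers.get("pragma") || "").includes("no-cache")) return false;
  if (request.headers.get("vary") === "*") return false;

  const contentType = response.headers.get("content-type") || "";
  return response.status === 200 && contentType.includes("text/html");
}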

Edge caching with plugin

The preferred solution requires a configured Cloudflare for WordPress plugin. We provide the following feature set when the plugin is activated:

  • HTML edge caching with 30 days TTL
  • 30 seconds or faster cache invalidation
  • Bypass HTML caching for logged in users
  • Bypass HTML caching based on presence of WordPress specific cookies
  • Decreased load on origin servers. If a request is fetched from the Cloudflare CDN cache we skip the request to the origin server.

How is this implemented?

When an eyeball requests a page from a website and Cloudflare doesn’t have a copy of the content, it will be fetched from the origin. As the response is sent from the origin and passes through Cloudflare’s edge, the Cloudflare for WordPress plugin adds a custom header: cf-edge-cache. It allows an origin to configure the caching rules applied to responses.

Based on the X-HTML-Edge-Cache proposal, the plugin adds a cf-edge-cache header to every origin response. There are two possible values:

  • cf-edge-cache: no-cache

The page contains private information that shouldn’t be cached by the edge. For example, an active session exists on the server.

  • cf-edge-cache: cache, platform=wordpress

This combination of cache and platform will ensure that the HTML page is cached. In addition, we run a number of checks against the presence of WordPress specific cookies to make sure we either bypass or allow caching on the edge.

If the header isn’t present, we assume that the Cloudflare for WordPress plugin is not installed or not up to date. In this case the feature operates without plugin support.
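
Putting that together, the worker’s handling of the cf-edge-cache header might look roughly like this sketch; the event wiring and the fallback helper are assumptions for illustration:

// Sketch: interpret the cf-edge-cache header on an origin response
async function handleOriginResponse(event, request, response) {
  const edgeCache = response.headers.get('cf-edge-cache') || ''

  // 'no-cache': the page contains private information - never cache it
  if (edgeCache.includes('no-cache')) return response

  // 'cache, platform=wordpress': safe to store at the edge
  if (edgeCache.startsWith('cache')) {
    // Store a copy in the background without delaying the eyeball response
    event.waitUntil(caches.default.put(request, response.clone()))
    return response
  }

  // Header absent: plugin missing or outdated - fall back to the
  // plugin-less mode described in the next section (hypothetical helper)
  return handleWithoutPlugin(event, request, response)
}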

Edge caching without plugin

Using the Automatic Platform Optimization feature in combination with the Cloudflare for WordPress plugin is our recommended solution. It provides the best feature set together with almost instant cache invalidation. Still, we wanted to provide performance improvements without the need for any installation on the origin server.

We provide the following feature set when the plugin is not activated:

  • HTML edge caching with 30 days TTL
  • Cache invalidation may take up to 30 minutes. A manual cache purge can be triggered to speed up cache invalidation
  • Bypass HTML caching based on presence of WordPress specific cookies
  • No decreased load on origin servers. If a request is fetched from the Cloudflare CDN cache, we still require an origin response to apply the cache invalidation logic.

Without the Cloudflare for WordPress plugin we still cache HTML on the edge and serve the content from the cache when possible. Cache revalidation happens after serving the response to the eyeball: the Worker’s waitUntil() callback allows code to run in the background without affecting the response.

We rely on the following headers to detect whether the content is stale and requires a cache update (a sketch of the comparison follows the list):

  • ETag. If the cached version and origin response both include an ETag and they are different, we replace the cached version with the origin response. The behavior is the same for strong and weak ETag values.
  • Last-Modified. If the cached version and origin response both include Last-Modified and the origin has a later Last-Modified date, we replace the cached version with the origin response.
  • Date. If no ETag or Last-Modified header is available, we compare the cached version and origin response Date values. If there is more than a 30 minute difference, we replace the cached version with the origin response.
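
A sketch of that comparison logic, using the 30 minute tolerance from the last rule; the function name and wiring are illustrative:

// Sketch: decide whether the cached copy should be replaced
const MAX_DATE_DIFF_MS = 30 * 60 * 1000 // 30 minutes

function isStale(cachedRes, originRes) {
  const cachedETag = cachedRes.headers.get('ETag')
  const originETag = originRes.headers.get('ETag')
  if (cachedETag && originETag) return cachedETag !== originETag

  const cachedLM = Date.parse(cachedRes.headers.get('Last-Modified'))
  const originLM = Date.parse(originRes.headers.get('Last-Modified'))
  if (!isNaN(cachedLM) && !isNaN(originLM)) return originLM > cachedLM

  // Fall back to comparing Date headers with a 30 minute tolerance
  const cachedDate = Date.parse(cachedRes.headers.get('Date'))
  const originDate = Date.parse(originRes.headers.get('Date'))
  return Math.abs(originDate - cachedDate) > MAX_DATE_DIFF_MS
}
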
Getting content everywhere

Cloudflare’s cache works great for frequently requested content. Regular requests to the site keep the content in cache. For a typical personal blog, however, the content will more commonly stay cached in only some parts of our vast edge network. With the Automatic Platform Optimization release we wanted to improve the loading time for a cache cold start from any location in the world. We explored different approaches and decided to use Workers KV to improve edge caching.

In addition to Cloudflare's CDN cache we put the content into Workers KV. It only requires a single request to the page to cache it, and within a minute it is available to be read back from KV in any Cloudflare data center.

Updating content

After an update has been made to the WordPress website, the plugin makes a request to Cloudflare’s API which both purges the cache and marks the content as stale in KV. The next request for the asset will trigger revalidation of the content. If the plugin is not enabled, cache revalidation logic is triggered as detailed previously.

We serve the stale copy of the content still present in KV and asynchronously fetch new content from the origin, apply possible optimizations, and then cache it (both in the regular local CDN cache and globally in KV).

To store the content in KV we use a single namespace. It’s keyed with a combination of a zone identifier and the URL. For instance:

1:example.com/blog-post-1.html => "transformed & cached content"

To mark content as stale in KV we write a new key which is read from the edge. If the key is present, we revalidate the content.

stale:1:example.com/blog-post-1.html => ""

Once the content has been revalidated, the stale marker key is deleted.
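
As a sketch of that flow (the KV namespace binding HTML_KV and the zone identifier are placeholders):

// Sketch: revalidate a cached page if its stale marker is present
const ZONE_ID = '1' // zone identifier used as a key prefix

async function revalidateIfStale(url, freshBody) {
  const key = `${ZONE_ID}:${url}`
  const marker = await HTML_KV.get(`stale:${key}`)
  if (marker === null) return // no marker - content is still fresh

  // Replace the globally cached copy, then clear the marker
  await HTML_KV.put(key, freshBody)
  await HTML_KV.delete(`stale:${key}`)
}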

Moving optimizations to the edge

On top of caching HTML at the edge, we can pre-process and transform the HTML to make the loading of websites even faster for the user. Moving the development of this feature to our Cloudflare Workers environment makes it easy to add performance features such as improved Google Fonts loading. Using Google Fonts can cause significant performance issues, as loading a font requires loading the HTML page, then loading a CSS file, and finally loading the font itself. All of these steps use different domains.

The solution is for the worker to inline the CSS and serve the font directly from the edge, minimizing the number of connections required.

The previous blog post’s implementation required a lot of manual work to support streaming HTML processing and character encodings. As the set of Workers APIs has improved over time, it is now much simpler to implement. Specifically, the addition of a streaming HTML rewriter/parser with a CSS-selector based API, and the ability to suspend parsing to asynchronously fetch a resource, have reduced the code required from ~600 lines of source to under 200.

export function transform(request, res) {
  return new HTMLRewriter()
    .on("link", {
      async element(e) {
        const src = e.getAttribute("href");
        const rel = e.getAttribute("rel");
        const isGoogleFont = src.startsWith("https://fonts.googleapis.com")
        if (isGoogleFont && rel === "stylesheet") {
          const media = e.getAttribute("media") || "all";
          const id = e.getAttribute("id") || "";
          try {
            const content = await fetchCSS(src, request);
            e.replace(styleTag({ media, id }, content), { html: true });
          } catch (e) {
            console.error(e);
          }
        }
      }
    })
    .transform(res);
}

The HTML transformation doesn’t block the response to the user. It runs as a background task which, when complete, will update KV and replace the globally cached version.

Making edge publishing generic

We are launching the feature for WordPress specifically but the concept can be applied to any website and content management system (CMS).

Categories: Technology

DNS Flag Day 2020

Fri, 02/10/2020 - 08:35
DNS Flag Day 2020

October 1 was this year’s DNS Flag Day. Read on to find out all about DNS Flag Day and how it affects Cloudflare’s DNS services (hint: it doesn’t, we already did the work to be compliant).

What is DNS Flag Day?

DNS Flag Day is an initiative by several DNS vendors and operators to increase the compliance of implementations with DNS standards. The goal is to make DNS more secure, reliable and robust. Rather than a push for new features, DNS Flag Day is meant to ensure that workarounds for non-compliance can be reduced and a common set of functionalities can be established and relied upon.

Last year’s flag day was February 1, and it set forth that servers and clients must be able to properly handle the Extensions to DNS (EDNS0) protocol (the first RFC about EDNS0 is from 1999 - RFC 2671). This way, by assuming clients have a working implementation of EDNS0, servers can resort to always sending messages as EDNS0. This is needed to support DNSSEC, the DNS security extensions. We were, of course, more than thrilled to support the effort, as we’re keen to push DNSSEC adoption forward.

DNS Flag Day 2020

The goal for this year’s flag day is to increase DNS messaging reliability by focusing on problems around IP fragmentation of DNS packets. The intention is to reduce DNS message fragmentation which continues to be a problem. We can do that by ensuring cleartext DNS messages sent over UDP are not too large, as large messages risk being fragmented during the transport. Additionally, when sending or receiving large DNS messages, we have the ability to do so over TCP.

Problem with DNS transport over UDP

A potential issue with sending DNS messages over UDP is that the sender has no indication of the recipient actually receiving the message. When using TCP, each packet being sent is acknowledged (ACKed) by the recipient, and the sender will attempt to resend any packets not being ACKed. UDP, although it may be faster than TCP, does not have the same mechanism of messaging reliability. Anyone still wishing to use UDP as their transport protocol of choice will have to implement this reliability mechanism in higher layers of the network stack. For instance, this is what is being done in QUIC, the new Internet transport protocol used by HTTP/3 that is built on top of UDP.

Even the earliest DNS standards (RFC 1035) specified the use of sending DNS messages over TCP as well as over UDP. Unfortunately, the choice of whether to support TCP was up to the implementer/operator, and firewalls were sometimes set to block DNS over TCP. More recent updates to RFC 1035, on the other hand, require that the DNS server is available to query using DNS over TCP.

DNS message fragmentation

Sending data over networks and the Internet is subject to a limit on how large each packet can be. Data is chopped up into a stream of packets, sized to adhere to the Maximum Transmission Unit (MTU) of the network. The MTU is typically 1500 bytes for IPv4 and, in the case of IPv6, the minimum is 1280 bytes. Subtracting both the IP header size (20 bytes for IPv4/40 bytes for IPv6) and the UDP protocol header size (8 bytes) from the MTU, we end up with a maximum DNS message size of 1472 bytes for IPv4 and 1232 bytes for IPv6 in order for a message to fit within a single packet. If the message is any larger than that, it will have to be fragmented into more packets.
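
Spelled out as a quick calculation:

// Maximum DNS message size that fits in a single unfragmented UDP packet
const IPV4_MTU = 1500, IPV6_MIN_MTU = 1280
const IPV4_HEADER = 20, IPV6_HEADER = 40, UDP_HEADER = 8

console.log(IPV4_MTU - IPV4_HEADER - UDP_HEADER)     // 1472 bytes over IPv4
console.log(IPV6_MIN_MTU - IPV6_HEADER - UDP_HEADER) // 1232 bytes over IPv6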

Sending large messages causes them to be fragmented into more than one packet. This is not a problem with TCP transports, since each packet is ACKed to ensure proper delivery. However, the same does not hold true when sending large DNS messages over UDP. In many respects, UDP has been treated as a second-class citizen to TCP as far as network routing is concerned. It is quite common to see UDP packet fragments being dropped by routers and firewalls, potentially causing parts of a message to be lost. To avoid fragmentation over UDP it is better to truncate the DNS message and set the Truncation Flag in the DNS response. This tells the recipient that more data is available if the query is retried over TCP.

DNS Flag Day 2020 wants to ensure that DNS message fragmentation does not happen. When larger DNS messages need to be sent, we need to ensure it can be done reliably over TCP.

DNS servers need to support DNS message transport over TCP in order to be compliant with this year's flag day. Also, DNS messages sent over UDP must never exceed the limit over which they risk being fragmented.

Cloudflare authoritative DNS and 1.1.1.1

We fully support the DNS Flag Day initiative, as it aims to make DNS more reliable and robust, and it ensures a common set of features for the DNS community to evolve on. In the DNS ecosystem, we are as much a client as we are a provider. When we perform DNS lookups on behalf of our customers and users, we rely on other providers to follow standards and be compliant. When they are not, and we can’t work around the issues, it leads to problems resolving names and reaching resources.

Both our public resolver 1.1.1.1 and our authoritative DNS service set and enforce reasonable limits on DNS message sizes when sent over UDP. Of course, both services are available over TCP. If you’re already using Cloudflare, there is nothing you need to do but keep using our DNS services! We will continually work on improving DNS.

Oh, and you can test your domain on the DNS Flag Day site: https://dnsflagday.net/2020/

Categories: Technology

NTS is now an RFC

Thu, 01/10/2020 - 15:53
NTS is now an RFC

Earlier today the document describing Network Time Security for NTP officially became RFC 8915. This means that Network Time Security (NTS) is officially part of the collection of protocols that makes the Internet work. We’ve changed our time service to use the officially assigned port of 4460 for NTS key exchange, so you can use our service with ease. This is big progress towards securing a ubiquitous Internet protocol.

Over the past months we’ve seen many users of our time service, but very few using Network Time Security. This leaves computers vulnerable to attacks that imitate the server they use to obtain time via NTP. Part of the problem was the lack of available NTP daemons that supported NTS. That problem is now solved: chrony and ntpsec both support NTS.
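
For example, with chrony (version 4.0 or later) a single configuration line is enough to enable NTS against our service; this sketch follows chrony’s documented syntax:

# /etc/chrony/chrony.conf
server time.cloudflare.com iburst nts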

Time underlies the security of many of the protocols such as TLS that we rely on to secure our online lives. Without accurate time, there is no way to determine whether or not credentials have expired. The absence of an easily deployed secure time protocol has been a problem for Internet security.

Without NTS or symmetric key authentication there is no guarantee that your computer is actually talking NTP with the computer you think it is. Symmetric key authentication is difficult and painful to set up, but until recently it has been the only secure and standardized mechanism for authenticating NTP. NTS uses the work that goes into the Web Public Key Infrastructure to authenticate NTP servers and ensure that when you set up your computer to talk to time.cloudflare.com, that’s the server your computer gets the time from.

Our involvement in developing and promoting NTS included making a specialized server and releasing the source code, participation in the standardization process, and much work with implementers to hunt down bugs. We also set up our time service with support for NTS from the beginning, and it was a useful resource for implementers to test interoperability.

NTS operation diagram

When Cloudflare first supported TLS 1.3, browsers were actively updating, and so deployment quickly took hold. However, the long tail of legacy installs and extended support releases slowed adoption. Similarly, until Let’s Encrypt made encryption easy for web servers, most web traffic was not encrypted.

By contrast, ssh quickly displaced telnet as the way to access remote systems: the security benefits were substantial, and the experience was better. Adoption of protocols is slow, but when there is a real security need it can be much faster. NTS is a real security improvement that is vital to adopt. We’re proud to continue making the Internet a better place by supporting secure protocols.

We hope that operating systems will incorporate NTS support and TLS 1.3 in their supplied NTP daemons. We also urge administrators to deploy NTS as quickly as possible, and NTP server operators to adopt NTS. With Let’s Encrypt-provided certificates this is simpler than it has been in the past.

We’re continuing our work in this area with the ongoing development of the Roughtime protocol for even better security, as well as engagement with the standardization process to help develop the future of Internet time.

Cloudflare allows any device to point to time.cloudflare.com, and our service supports NTS. Just as our Universal SSL made it easy for any website to get the security benefits of TLS, our time service makes it easy for any computer to get the benefits of secure time.

Categories: Technology

Introducing API Shield

Thu, 01/10/2020 - 14:01
Introducing API Shield

APIs are the lifeblood of modern Internet-connected applications. Every millisecond they carry requests from mobile applications—place this food delivery order, “like” this picture—and directions to IoT devices—unlock the car door, start the wash cycle, my human just finished a 5k run—among countless other calls.

They’re also the target of widespread attacks designed to perform unauthorized actions or exfiltrate data, as data from Gartner increasingly shows: “by 2021, 90% of web-enabled applications will have more surface area for attack in the form of exposed APIs rather than the UI, up from 40% in 2019,” and “Gartner predicted that, by 2022, API abuses will move from an infrequent to the most-frequent attack vector, resulting in data breaches for enterprise web applications”[1][2]. Of the 18 million requests per second that traverse Cloudflare’s network, 50% are directed towards APIs—with the majority of these requests blocked as malicious.

To combat these threats, Cloudflare is making it simple to secure APIs through the use of strong client certificate-based identity and strict schema-based validation. As of today, these capabilities are available free for all plans within our new “API Shield” offering. And as of today, the security benefits also extend to gRPC-based APIs, which use binary formats such as protocol buffers rather than JSON, and have been growing in popularity with our customer base.

Continue reading to learn more about the new capabilities, or jump right to the "Demonstration" paragraph for examples of how to get started configuring your first API Shield rule.

Positive security models and client certificates

A “positive security” model is one that allows only known behavior and identities, while rejecting everything else. It is the opposite of the traditional “negative security” model enforced by a Web Application Firewall (WAF) that allows everything except for requests coming from problematic IPs, ASNs, countries or requests with problematic signatures (SQL injection attempts, etc.).

Implementing a positive security model for APIs is the most direct way to eliminate the noise of credential stuffing attacks and other automated scanning tools. And the first step towards a positive model is deploying strong authentication such as mutual TLS authentication, which is not vulnerable to the reuse or sharing of passwords.

Just as we simplified the issuance of server certificates back in 2014 with Universal SSL, API Shield reduces the process of issuing client certificates to clicking a few buttons in the Cloudflare Dashboard. By providing a fully hosted private public key infrastructure (PKI), you can focus on your applications and features—rather than operating and securing your own certificate authority (CA).

Enforcing valid requests with schema validation

Once developers can be sure that only legitimate clients (with SSL certificates in hand) are connecting to their APIs, the next step in implementing a positive security model is making sure that those clients are making valid requests. Extracting a client certificate from a device and reusing it elsewhere is difficult, but not impossible, so it’s also important to make sure that the API is being called as intended.

Requests containing extraneous input may not have been anticipated by the API developer, and can cause problems if processed directly by the application, so these should be dropped at the edge if possible. API schema validation works by matching the contents of API requests—the query parameters that come after the URL and the contents of the POST body—against a contract or “schema” that contains the rules for what is expected. If validation fails, the API call is blocked, protecting the origin from an invalid request or a malicious payload.
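
For example, a schema for the temperature API used in the demonstration below might constrain the POST body like this hypothetical JSON Schema (the exact format accepted by the beta may differ):

{
  "type": "object",
  "properties": {
    "temperature": { "type": "number" },
    "time": { "type": "string", "format": "date-time" }
  },
  "required": ["temperature", "time"],
  "additionalProperties": false
}

Any POST carrying extra fields, or missing the expected ones, would then be rejected at the edge.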

Schema validation is currently in closed beta for JSON payloads, with gRPC/protocol buffer support on the roadmap. If you would like to join the beta please open a support ticket with the subject “API Schema Validation Beta”. After the beta has ended, we plan to make schema validation available as part of the API Shield user interface.

Demonstration

To demonstrate how the APIs powering IoT devices and mobile applications can be secured, we have built an API Shield demonstration using client certificates and schema validation.

Temperatures are captured by an IoT device, represented in the demo by a Raspberry Pi 3 Model B+ with an external infrared temperature sensor, and then transmitted via a POST request to a Cloudflare-protected API. Temperatures are subsequently retrieved by GET requests and then displayed in a mobile application built in Swift for iOS.

In both cases, the API was actually built using Cloudflare Workers® and Workers KV, but can be replaced by any Internet-accessible endpoint.

1. API Configuration

Before configuring the IoT device and mobile application to communicate securely with the API, we need to bootstrap the API endpoints. To keep the example simple, while also allowing for additional customization, we’ve implemented the API as a Cloudflare Worker (borrowing code from the To-Do List tutorial).

In this particular example the temperatures are stored in Workers KV using the source IP address as a key, but this could easily be replaced by a value from the client certificate, e.g., the fingerprint. The code below saves a temperature and timestamp into KV when a POST is made, and returns the most recent 5 temperatures when a GET request is made.

const defaultData = { temperatures: [] }

const getCache = key => TEMPERATURES.get(key)
const setCache = (key, data) => TEMPERATURES.put(key, data)

async function addTemperature(request) {
  // pull previously recorded temperatures for this client
  const ip = request.headers.get('CF-Connecting-IP')
  const cacheKey = `data-${ip}`
  let data
  const cache = await getCache(cacheKey)
  if (!cache) {
    await setCache(cacheKey, JSON.stringify(defaultData))
    data = defaultData
  } else {
    data = JSON.parse(cache)
  }

  // append the recorded temperatures with the submitted reading
  // (assuming it has both temperature and a timestamp)
  try {
    const body = await request.text()
    const val = JSON.parse(body)
    if (val.temperature && val.time) {
      data.temperatures.push(val)
      await setCache(cacheKey, JSON.stringify(data))
      return new Response("", { status: 201 })
    } else {
      return new Response("Unable to parse temperature and/or timestamp from JSON POST body", { status: 400 })
    }
  } catch (err) {
    return new Response(err, { status: 500 })
  }
}

function compareTimestamps(a, b) {
  return -1 * (Date.parse(a.time) - Date.parse(b.time))
}

// return the 5 most recent temperature measurements
async function getTemperatures(request) {
  const ip = request.headers.get('CF-Connecting-IP')
  const cacheKey = `data-${ip}`

  const cache = await getCache(cacheKey)
  if (!cache) {
    return new Response(JSON.stringify(defaultData), { status: 200, headers: { 'content-type': 'application/json' } })
  } else {
    const data = JSON.parse(cache)
    const retval = JSON.stringify(data.temperatures.sort(compareTimestamps).splice(0, 5))
    return new Response(retval, { status: 200, headers: { 'content-type': 'application/json' } })
  }
}

async function handleRequest(request) {
  if (request.method === 'POST') {
    return addTemperature(request)
  } else {
    return getTemperatures(request)
  }
}

addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

Before adding mutual TLS authentication, we’ll test POST’ing a random temperature reading:

$ TEMPERATURE=$(echo $((361 + RANDOM %11)) | awk '{printf("%.2f",$1/10.0)}')
$ TIMESTAMP=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
$ echo -e "$TEMPERATURE\n$TIMESTAMP"
36.30
2020-09-28T02:57:49Z

$ curl -v -H "Content-Type: application/json" -d '{"temperature":'''$TEMPERATURE''', "time": "'''$TIMESTAMP'''"}' https://shield.upinatoms.com/temps 2>&1 | grep "< HTTP/2"
< HTTP/2 201

And here’s a subsequent read of that temperature, along with the previous 4 that were submitted:

$ curl -s https://shield.upinatoms.com/temps | jq .
[
  { "temperature": 36.3, "time": "2020-09-28T02:57:49Z" },
  { "temperature": 36.7, "time": "2020-09-28T02:54:56Z" },
  { "temperature": 36.2, "time": "2020-09-28T02:33:08Z" },
  { "temperature": 36.5, "time": "2020-09-28T02:29:22Z" },
  { "temperature": 36.9, "time": "2020-09-28T02:27:19Z" }
]

2. Client certificate issuance

With our API in hand, it’s time to lock it down to require a valid client certificate. Before doing so we’ll want to generate those certificates. To do so, you can either go to the SSL/TLS → Client Certificates tab of the Cloudflare Dashboard and click “Create Certificate” or you can automate the process via API calls.

Because most developers at scale will be generating their own private keys and CSRs and requesting that they be signed via API, we’ll show that process here. Using Cloudflare’s PKI toolkit CFSSL we’ll first create a bootstrap certificate for the iOS application, and then we’ll create a certificate for the IoT device:

$ cat <<'EOF' | tee -a csr.json
{
    "hosts": [
        "ios-bootstrap.devices.upinatoms.com"
    ],
    "CN": "ios-bootstrap.devices.upinatoms.com",
    "key": { "algo": "rsa", "size": 2048 },
    "names": [{
        "C": "US",
        "L": "Austin",
        "O": "Temperature Testers, Inc.",
        "OU": "Tech Operations",
        "ST": "Texas"
    }]
}
EOF
$ cfssl genkey csr.json | cfssljson -bare certificate
2020/09/27 21:28:46 [INFO] generate received request
2020/09/27 21:28:46 [INFO] received CSR
2020/09/27 21:28:46 [INFO] generating key: rsa-2048
2020/09/27 21:28:47 [INFO] encoded CSR
$ mv certificate-key.pem ios-key.pem
$ mv certificate.csr ios.csr

// and do the same for the IoT sensor
$ sed -i.bak 's/ios-bootstrap/sensor-001/g' csr.json
$ cfssl genkey csr.json | cfssljson -bare certificate
...
$ mv certificate-key.pem sensor-key.pem
$ mv certificate.csr sensor.csr

Generate a private key and CSR for the IoT device and iOS application

// we need to replace actual newlines in the CSR with '\n' before POST'ing
$ CSR=$(cat ios.csr | perl -pe 's/\n/\\n/g')
$ request_body=$(< <(cat <<EOF
{
  "validity_days": 3650,
  "csr":"$CSR"
}
EOF
))

// save the response so we can view it and then extract the certificate
$ curl -H 'X-Auth-Email: YOUR_EMAIL' -H 'X-Auth-Key: YOUR_API_KEY' -H 'Content-Type: application/json' -d "$request_body" https://api.cloudflare.com/client/v4/zones/YOUR_ZONE_ID/client_certificates > response.json

$ cat response.json | jq .
{
  "success": true,
  "errors": [],
  "messages": [],
  "result": {
    "id": "7bf7f70c-7600-42e1-81c4-e4c0da9aa515",
    "certificate_authority": {
      "id": "8f5606d9-5133-4e53-b062-a2e5da51be5e",
      "name": "Cloudflare Managed CA for account 11cbe197c050c9e422aaa103cfe30ed8"
    },
    "certificate": "-----BEGIN CERTIFICATE-----\nMIIEkzCCA...\n-----END CERTIFICATE-----\n",
    "csr": "-----BEGIN CERTIFICATE REQUEST-----\nMIIDITCCA...\n-----END CERTIFICATE REQUEST-----\n",
    "ski": "eb2a48a19802a705c0e8a39489a71bd586638fdf",
    "serial_number": "133270673305904147240315902291726509220894288063",
    "signature": "SHA256WithRSA",
    "common_name": "ios-bootstrap.devices.upinatoms.com",
    "organization": "Temperature Testers, Inc.",
    "organizational_unit": "Tech Operations",
    "country": "US",
    "state": "Texas",
    "location": "Austin",
    "expires_on": "2030-09-26T02:41:00Z",
    "issued_on": "2020-09-28T02:41:00Z",
    "fingerprint_sha256": "84b045d498f53a59bef53358441a3957de81261211fc9b6d46b0bf5880bdaf25",
    "validity_days": 3650
  }
}

$ cat response.json | jq .result.certificate | perl -npe 's/\\n/\n/g; s/"//g' > ios.pem

// now ask that the second client certificate signing request be signed
$ CSR=$(cat sensor.csr | perl -pe 's/\n/\\n/g')
$ request_body=$(< <(cat <<EOF
{
  "validity_days": 3650,
  "csr":"$CSR"
}
EOF
))
$ curl -H 'X-Auth-Email: YOUR_EMAIL' -H 'X-Auth-Key: YOUR_API_KEY' -H 'Content-Type: application/json' -d "$request_body" https://api.cloudflare.com/client/v4/zones/YOUR_ZONE_ID/client_certificates | perl -npe 's/\\n/\n/g; s/"//g' > sensor.pem

Ask Cloudflare to sign the CSRs with the private CA issued for your zone

3. API Shield rule creation

With certificates in hand we can now configure the API endpoint to require their use. Below is a demonstration of how to create such a rule.

The steps include specifying which hostnames to prompt for certificates, e.g., shield.upinatoms.com, and then creating the API Shield rule.

4. IoT Device Communication

To prepare the IoT device for secure communication with our API endpoint we need to embed the certificate on the device, and then point our application to it so it can be used when making the POST request to the API endpoint.

We securely copied the private key and certificate into /etc/ssl/private/sensor-key.pem and /etc/ssl/certs/sensor.pem, and then modified our sample script to point to these files:

import requests
import json
from datetime import datetime

def readSensor():
    # Takes a reading from a temperature sensor and stores it to temp_measurement
    dateTimeObj = datetime.now()
    timestampStr = dateTimeObj.strftime('%Y-%m-%dT%H:%M:%SZ')
    measurement = {'temperature': str(36.5), 'time': timestampStr}
    return measurement

def main():
    print("Cloudflare API Shield [IoT device demonstration]")

    temperature = readSensor()
    payload = json.dumps(temperature)

    url = 'https://shield.upinatoms.com/temps'
    json_headers = {'Content-Type': 'application/json'}
    cert_file = ('/etc/ssl/certs/sensor.pem', '/etc/ssl/private/sensor-key.pem')

    r = requests.post(url, headers=json_headers, data=payload, cert=cert_file)

    print("Request body: ", r.request.body)
    print("Response status code: %d" % r.status_code)

# run the demo when executed as a script
if __name__ == '__main__':
    main()

When the script attempts to connect to https://shield.upinatoms.com/temps, Cloudflare requests that a client certificate is sent, and our script sends the contents of sensor.pem before demonstrating it has possession of sensor-key.pem, as required to complete the SSL/TLS handshake.

If we fail to send the client certificate or attempt to include extraneous fields in the API request, the schema validation (configuration not shown) fails and the request is rejected:

Cloudflare API Shield [IoT device demonstration]
Request body: {"temperature": "36.5", "time": "2020-09-28T15:52:19Z"}
Response status code: 403

If instead a valid certificate is presented and the payload follows the schema previously uploaded, our script POSTs the latest temperature reading to the API.

Cloudflare API Shield [IoT device demonstration]
Request body: {"temperature": "36.5", "time": "2020-09-28T15:56:45Z"}
Response status code: 201

5. Mobile Application (iOS) Communication

Now that temperature requests have been sent to our API endpoint, it’s time to read them securely from our mobile application using one of the client certificates.

For purposes of brevity, we’re going to embed a “bootstrap” certificate and key as a PKCS#12 file within the application bundle. In a real world deployment, this bootstrap certificate should only be used alongside users’ credentials to authenticate to an API endpoint that can return a unique user certificate. Corporate users will want to use MDM to distribute certificates for additional control and persistence options.

Package the certificate and private key

Before adding the bootstrap certificate and private key, we need to combine them into a binary PKCS#12 file. This binary file will then be added to our iOS application bundle.

$ openssl pkcs12 -export -out bootstrap-cert.pfx -inkey ios-key.pem -in ios.pem
Enter Export Password:
Verifying - Enter Export Password:

Add the certificate bundle to your iOS application

Within XCode, click File → Add Files To "[Project Name]" and select your .pfx file. Make sure to check "Add to target" before confirming.

Modify your URLSession code to use the client certificate

This article provides a nice walkthrough of using a PKCS#12 class and URLSessionDelegate to modify your application to complete mutual TLS authentication when connecting to an API that requires it.

Looking Forward

In the coming months, we plan to expand API Shield with a number of additional features designed to protect API traffic. For customers that want to use their own PKI, we will provide the ability to import their own CAs, something available today as part of Cloudflare Access.

As we receive feedback on our schema validation beta, we will look to make the capability generally available to all customers. If you’re trying out the beta and have thoughts to share, we’d love to hear your feedback.

Beyond certificates and schema validation, we’re excited to layer on additional API security capabilities as well as deep analytics to help you better understand your APIs. If there are features you’d like to see, let us know in the comments below!

1: “By 2021, 90% of web-enabled applications will have more surface area for attack in the form of exposed APIs rather than the UI, up from 40% in 2019.” Source: Gartner, “Gartner’s API Strategy Maturity Model”, Saniye Alaybeyi, Mark O'Neill, October 21, 2019. (Gartner subscription required)

2: “Gartner predicted by 2022, API abuses will move from an infrequent to the most-frequent attack vector, resulting in data breaches for enterprise web applications.” Source: Gartner, “Cool Vendors in API Strategy”, Shameen Pillai, Paolo Malinverno, Mark O'Neill, Jeremy D'Hoinne, May 18, 2020. (Gartner subscription required)

Categories: Technology

Announcing support for gRPC

Thu, 01/10/2020 - 14:00
Announcing support for gRPC

Today we're excited to announce beta support for proxying gRPC, a next-generation protocol that allows you to build APIs at scale. With gRPC on Cloudflare, you get access to the security, reliability and performance features that you're used to having at your fingertips for traditional APIs. Sign up for the beta today in the Network tab of the Cloudflare dashboard.

gRPC has proven itself to be a popular new protocol for building APIs at scale: it’s more efficient and built to offer superior bi-directional streaming capabilities. However, because gRPC uses newer technology, like HTTP/2, under the covers, existing security and performance tools did not support gRPC traffic out of the box. This meant that customers adopting gRPC to power their APIs had to pick between modernity on one hand, and things like security, performance, and reliability on the other. Because supporting modern protocols and making sure people can operate them safely and performantly is in our DNA, we set out to fix this.

When you put your gRPC APIs on Cloudflare, you immediately gain the benefits that come with Cloudflare. Apprehensive of exposing your APIs to bad actors? Add security features such as WAF and Bot Management. Need more performance? Turn on Argo Smart Routing to decrease time to first byte. Or increase reliability by adding a Load Balancer.

And naturally, gRPC plugs in to API Shield, allowing you to add more security by enforcing client authentication and schema validation at the edge.

What is gRPC?

Protocols like JSON-REST have been the bread and butter of Internet facing APIs for several years. They're great in that they operate over HTTP, their payloads are human readable, and a large body of tooling exists to quickly set up an API for another machine to talk to. However, the same things that make these protocols popular are also weaknesses; JSON, as an example, is inefficient to store and transmit, and expensive for computers to parse.

In 2015, Google introduced gRPC, a protocol designed to be fast and efficient, relying on binary protocol buffers to serialize messages before they are transferred over the wire. This prevents (normal) humans from reading them but results in much higher processing efficiency. gRPC has become increasingly popular in the era of microservices because it neatly addresses the shortfalls laid out above.

JSON:             { "foo": "bar" }
Protocol Buffers: 0b111001001100001011000100000001100001010

gRPC relies on HTTP/2 as a transport mechanism. This poses a problem for customers trying to deploy common security technologies like web application firewalls, as most reverse proxy solutions (including Cloudflare’s HTTP stack, until today) downgrade HTTP requests to HTTP/1.1 before sending them off to an origin.

Beyond microservices in a datacenter, the original use case for gRPC, adoption has grown in many other contexts. Many popular mobile apps have millions of users that all rely on messages being sent back and forth between mobile phones and servers. We've seen many customers wire up API connectivity for their mobile apps by using the same gRPC API endpoints they already have inside their data centers for communication with clients in the outside world.

While this solves the efficiency issues with running services at scale, it exposes critical parts of these customers' infrastructure to the Internet, introducing security and reliability issues. Today we are introducing support for gRPC at Cloudflare, to secure and improve the experience of running gRPC APIs on the Internet.

How does gRPC + Cloudflare work?

The engineering work our team had to do to add gRPC support is composed of a few pieces:

  1. Changes to the early stages of our request processing pipeline to identify gRPC traffic coming down the wire.
  2. Additional functionality in our WAF to “understand” gRPC traffic, ensuring gRPC connections are handled correctly within the WAF, including inspecting all components of the initial gRPC connection request.
  3. Adding support to establish HTTP/2 connections to customer origins for gRPC traffic, allowing gRPC to be proxied through our edge. HTTP/2 to origin support is currently limited to gRPC traffic, though we expect to expand the scope of traffic proxied back to origin over HTTP/2 soon.

What does this mean for you, a Cloudflare customer interested in using our tools to secure and accelerate your API? Because of the hard work we’ve done, enabling support for gRPC is a click of a button in the Cloudflare dashboard.

Using gRPC to build mobile apps at scale

Why does Cloudflare supporting gRPC matter? To dig in on one use case, let’s look at mobile apps. Apps need quick, efficient ways of interacting with servers to get the information needed to show on your phone. There is no browser, so they rely on APIs to get the information. API stands for application programming interface; an API is a standardized way for machines (say, your phone and a server) to talk to each other.

Let's say we're a mobile app developer with thousands, or even millions of users. With this many users, using a modern protocol like gRPC allows us to run less compute infrastructure than would be necessary with older, less efficient protocols like JSON-REST. But exposing these endpoints, naked, on the Internet is really scary. Up until now there were very few, if any, options for protecting gRPC endpoints against application layer attacks with a WAF and guarding against volumetric attacks with DDoS mitigation tools. That changes today, with Cloudflare adding gRPC to its set of supported protocols.

With gRPC on Cloudflare, you get the benefits of our security, reliability and performance products:

  • WAF for inspection of incoming gRPC requests. Use managed rules or craft your own.
  • Load Balancing to increase reliability: configure multiple gRPC backends to handle the load, let Cloudflare distribute the load across them. Backend selection can be done in round-robin fashion, based on health checks or load.
  • Argo Smart Routing to increase performance by transporting your gRPC messages faster than the Internet would be able to route them. Messages are routed around congestion on the Internet, resulting in an average reduction of time to first byte by 30%.

And of course, all of this works with API Shield, an easy way to add mTLS authentication to any API endpoint.

Enabling gRPC support

To enable gRPC support, head to the Cloudflare dashboard and go to the Network tab. From there you can sign up for the beta.

We have limited seats available at launch, but will open up more broadly over the next few weeks. After signing up and toggling gRPC support, you’ll have to enable Cloudflare proxying on your domain on the DNS tab to activate Cloudflare for your gRPC API.

We’re excited to bring gRPC support to the masses, allowing you to add the security, reliability and performance benefits that you’re used to getting with Cloudflare. Enabling is just a click away. Take it for a spin and let us know what you think!

Categories: Technology

Introducing Cloudflare Radar

Wed, 30/09/2020 - 14:01
Introducing Cloudflare Radar

Unlike the tides, Internet use ebbs and flows with the motion of the sun, not the moon. Across the world usage quietens during the night and picks up as morning comes. Internet use also follows patterns that humans create, dipping down when people stop to applaud healthcare workers fighting COVID-19, pausing to watch their country’s president address them, or slowing for religious reasons.

And while humans leave a mark on the Internet, so do automated systems. These systems might be doing useful work (like building search engine databases) or harm (like scraping content, or attacking an Internet property).

All the while Internet use (and attacks) is growing. Zoom into any day and you’ll see the familiar daily wave of Internet use reflecting day and night, zoom out and you’ll likely spot weekends when Internet use often slows down a little, zoom out further and you might spot the occasional change in use caused by a holiday, zoom out further and you’ll see that Internet use grows inexorably.

And attacks don’t only grow, they change. New techniques are invented while old ones remain evergreen. DDoS activity continues day and night roaming from one victim to another. Automated scanning tools look for vulnerabilities in anything, literally anything, connected to the Internet.

Sometimes the Internet fails in a country, perhaps because of a cable cut somewhere beneath the sea, or because of government intervention. That too is something we track and measure.

All this activity, good and bad, shows up in the trends and details that Cloudflare tracks to help improve our service and protect our customers. Until today this insight was only available internally at Cloudflare, today we are launching a new service, Cloudflare Radar, that shines a light on the Internet’s patterns.

Each second, Cloudflare handles on average 18 million HTTP requests and 6 million DNS requests. With 1 billion unique IP addresses connecting to Cloudflare’s network we have one of the most representative views on Internet traffic worldwide.

And by blocking 72 billion cyberthreats every day Cloudflare also has a unique position in understanding and mitigating Internet threats.

Our goal is to help build a better Internet and we want to do this by exposing insights, threats and trends based on the aggregated data that we have. We want to help anyone understand what is happening on the Internet from a security, performance and usage perspective. Every Internet user should have easy access to answer the questions that they have.

There are three key components that we’re launching today: Radar Internet Insights, Radar Domain Insights and Radar IP Insights.

Radar Internet Insights

At the top of Cloudflare Radar we show the latest news about events that are currently happening on the Internet. This includes news about the adoption of new technologies, browsers or operating systems. We are also keeping all users up to date with interesting events around developments in Internet traffic. This could be traffic patterns seen in specific countries or patterns related to events like the COVID-19 pandemic.

Sign up for Radar Alerts to always stay up-to-date.

Below the news section users can find rapidly updated trend data, all of which can be viewed worldwide or by country. The data is available for several time frames: last hour, last 24 hours, last 7 days. We'll soon make the 30 day time frame available to help explore longer term trends.

Change in Internet traffic

You can drill down on specific countries and Cloudflare Radar will show you the change in aggregate Internet traffic seen by our network for that country. We also show an info box on the right with a snapshot of interesting data points.

Most popular and trending domains

Worldwide and for individual countries we have an algorithm calculating which domains are most popular and have recently started trending (i.e. have seen a large change in popularity). Services with multiple domains and subdomains are aggregated to ensure best comparability. We show here the relative rank of domains and are able to spot big changes in ranking to highlight new trends as they appear.

The trending domains section is still in beta as we are training our algorithm to best detect the next big things as they emerge.

There is also a search bar that enables a user to search for a specific domain or IP address to get detailed information about it. More on that below.

Attack activity

The attack activity section gives information about different types of cyberattacks observed by Cloudflare. First we show the attacks mitigated by our Layer 3 and 4 Denial of Service prevention systems, along with the attack protocol used and the change in attack volume over the selected time frame.

Secondly, we show Layer 7 threat information based on requests that we blocked. Layer 7 requests get blocked by a variety of systems (such as our WAF, our layer 7 DDoS mitigation system and our customer configurable firewall). We show the system responsible for blocking as well as the change of blocked requests over the selected time frame.

Technology Trends

Based on the analytics we handle on HTTP requests we are able to show trends over a diverse set of data points. This includes the distribution of mobile vs. desktop traffic, or the percentage of traffic detected as coming from bots. We also dig into longer term trends like the use of HTTPS or the share of IPv6.

The bottom section shows the top browsers worldwide or for the selected country. In this example we selected Vietnam and you can see that over 6% of users are using Cốc Cốc, a local browser.

Radar Domain Insights

We give users the option to dig deeper into an individual domain, with the opportunity to see its global ranking as well as security information. This enables everyone to identify potential threats and risks.

To look up a domain or hostname in Radar, type it into the search box within the top domains section on the Radar Internet Insights homepage.

For example, suppose you search for cloudflare.com. You’ll get sent to a domain-specific page with information about cloudflare.com.

At the top we provide an overview of the domain’s configuration with Domain Badges. From here you can, at a glance, understand what technologies the domain is using. For cloudflare.com you can see that it supports TLS, IPv6, DNSSEC and eSNI. There’s also an indication of the age of the domain (since registration) and its worldwide popularity.

Below you find the domain’s content categories. If you find a domain that is in the wrong category, please use our Domain Categorization Feedback to let us know.

We also show global popularity trends from our domain ranking formula. For domains with a global audience there’s also a map giving information about popularity by country.

Radar IP Insights

For an individual IP address (instead of a domain) we show different information. To look up an IP address, simply enter it in the search bar within the top domains section on Radar Internet Insights. For a quick lookup of your own IP just open radar.cloudflare.com/me.

For IPs we show the network (the ASN) and geographic information. For your own IP we also show more detailed location information as well as an invitation to check the speed of your Internet connection using speed.cloudflare.com.

Next Steps

The current product is just the beginning of Cloudflare’s approach to making knowledge about the Internet more accessible. Over the next few weeks and months we will add more data points and the 30 day time frame functionality. And we’ll allow users to filter the charts not only by country but also by categorization (such as by industry).

Stay tuned for more to come.

Categories: Technology

Speeding up HTTPS and HTTP/3 negotiation with... DNS

Wed, 30/09/2020 - 14:00
Speeding up HTTPS and HTTP/3 negotiation with... DNS

In late June, Cloudflare's resolver team noticed a spike in DNS requests for the 65479 Resource Record thanks to data exposed through our new Radar service. We began investigating and found these to be a part of Apple’s iOS 14 beta release, where they were testing out a new SVCB/HTTPS record type.

Once we saw that Apple was requesting this record type, and while the iOS 14 beta was still ongoing, we rolled out support across the Cloudflare customer base.

This blog post explains what this new record type does and its significance, but there’s also a deeper story: Cloudflare customers get automatic support for new protocols like this.

That means that today if you’ve enabled HTTP/3 on an Apple device running iOS 14, when it needs to talk to a Cloudflare customer (say you browse to a Cloudflare-protected website, or use an app whose API is on Cloudflare) it can find the best way of making that connection automatically.

And if you’re a Cloudflare customer you have to do… absolutely nothing… to give Apple users the best connection to your Internet property.

Negotiating HTTP security and performance

Whenever a user types a URL in the browser box without specifying a scheme (like “https://” or “http://”), the browser cannot assume, without prior knowledge such as a Strict-Transport-Security (HSTS) cache or preload list entry, whether the requested website supports HTTPS or not. The browser will first try to fetch the resources using plaintext HTTP, and only if the website redirects to an HTTPS URL, or if it specifies an HSTS policy in the initial HTTP response, the browser will then fetch the resource again over a secure connection.
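
For reference, an HSTS policy is just a response header; a typical one looks like this:

Strict-Transport-Security: max-age=31536000; includeSubDomains; preload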

This means that the latency incurred in fetching the initial resource (say, the index page of a website) is doubled, due to the fact that the browser needs to re-establish the connection over TLS and request the resource all over again. But worse still, the initial request is leaked to the network in plaintext, which could potentially be modified by malicious on-path attackers (think of all those unsecured public WiFi networks) to redirect the user to a completely different website. In practical terms, this weakness is sometimes used by said unsecured public WiFi network operators to sneak advertisements into people’s browsers.

Unfortunately, that’s not the full extent of it. This problem also impacts HTTP/3, the newest revision of the HTTP protocol that provides increased performance and security. HTTP/3 is advertised using the Alt-Svc HTTP header, which is only returned after the browser has already contacted the origin using a different and potentially less performant HTTP version. The browser ends up missing out on using faster HTTP/3 on its first visit to the website (although it does store the knowledge for later visits).
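
For reference, a typical Alt-Svc header advertising HTTP/3 looks like the following (the exact protocol token has varied across draft versions):

Alt-Svc: h3=":443"; ma=86400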

The fundamental problem comes from the fact that negotiation of HTTP-related parameters (such as whether HTTPS or HTTP/3 can be used) is done through HTTP itself (either via a redirect, HSTS and/or Alt-Svc headers). This leads to a chicken and egg problem where the client needs to use the most basic HTTP configuration that has the best chance of succeeding for the initial request. In most cases this means using plaintext HTTP/1.1. Only after it learns of the supported parameters can it change its configuration for subsequent requests.

But before the browser can even attempt to connect to the website, it first needs to resolve the website’s domain to an IP address via DNS. This presents an opportunity: what if additional information required to establish a connection could be provided, in addition to IP addresses, with DNS?

That’s what we’re excited to be announcing today: Cloudflare has rolled out initial support for HTTPS records to our edge network. Cloudflare’s DNS servers will now automatically generate HTTPS records on the fly to advertise whether a particular zone supports HTTP/3 and/or HTTP/2, based on whether those features are enabled on the zone.

Service Bindings via DNS

The new proposal, currently discussed by the Internet Engineering Task Force (IETF), defines a family of DNS resource record types (“SVCB”) that can be used to negotiate parameters for a variety of application protocols.

The generic DNS record “SVCB” can be instantiated into records specific to different protocols. The draft specification defines one such instance called “HTTPS”, specific to the HTTP protocol, which can be used not only to signal to the client that it can connect over a secure connection (skipping the initial unsecured request), but also to advertise the different HTTP versions supported by the website. In the future, potentially even more features could be advertised.

example.com 3600 IN HTTPS 1 . alpn="h3,h2"

The DNS record above advertises support for the HTTP/3 and HTTP/2 protocols for the example.com origin.

This is best used alongside DNS over HTTPS or DNS over TLS, and DNSSEC, to again prevent malicious actors from manipulating the record.
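
You can already see these records by querying our resolver over DNS over HTTPS. A quick sketch using the JSON API of 1.1.1.1 (type 65 is the number assigned to the HTTPS record type):

// Query the HTTPS record for a domain via Cloudflare's DoH JSON API
fetch('https://cloudflare-dns.com/dns-query?name=cloudflare.com&type=65', {
  headers: { accept: 'application/dns-json' },
})
  .then(res => res.json())
  .then(json => console.log(json.Answer))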

The client will need to fetch not only the typical A and AAAA records to get the origin’s IP addresses, but also the HTTPS record. It can of course do these lookups in parallel to avoid additional latency at the start of the connection, but this could potentially lead to A/AAAA and HTTPS responses diverging from each other. For example, in cases where the origin makes use of DNS load-balancing: if an origin can be served by multiple CDNs it might happen that the responses for A and/or AAAA records come from one CDN, while the HTTPS record comes from another. In some cases this can lead to failures when connecting to the origin (say, if the HTTPS record from one of the CDNs advertises support for HTTP/3, but the CDN the client ends up connecting to doesn’t support it).

This is solved by the SVCB and HTTPS records by providing the IP addresses directly, without the need for the client to look at A and AAAA records. This is done via the “ipv4hint” and “ipv6hint” parameters that can optionally be added to these records, which provide lists of IPv4 and IPv6 addresses that can be used by the client in lieu of the addresses specified in A and AAAA records. Of course clients will still need to query the A and AAAA records, to support cases where no SVCB or HTTPS record is available, but these IP hints provide an additional layer of robustness.

example.com 3600 IN HTTPS 1 . alpn="h3,h2" ipv4hint="192.0.2.1" ipv6hint="2001:db8::1"

In addition to all this, SVCB and HTTPS can also be used to define alternative endpoints that are authoritative for a service, in a similar vein to SRV records:

example.com 3600 IN HTTPS 1 example.net alpn="h3,h2"
example.com 3600 IN HTTPS 2 example.org alpn="h2"

In this case the “example.com” HTTPS service can be provided by both “example.net” (which supports both HTTP/3 and HTTP/2, in addition to HTTP/1.x) as well as “example.org” (which only supports HTTP/2 and HTTP/1.x). The client will first need to fetch A and AAAA records for “example.net” or “example.org” before being able to connect, which might increase the connection latency, but the service operator can make use of the IP hint parameters discussed above in this case as well, to reduce the amount of required DNS lookups the client needs to perform.

This means that SVCB and HTTPS records might finally provide a way for SRV-like functionality to be supported by popular browsers and other clients that have historically not supported SRV records.

There is always room at the top (apex)

When setting up a website on the Internet, it’s common practice to use a “www” subdomain (like in “www.cloudflare.com”) to identify the site, as well as the “apex” (or “root”) of the domain (in this case, “cloudflare.com”). In order to avoid duplicating the DNS configuration for both domains, the “www” subdomain can typically be configured as a CNAME (Canonical Name) record, that is, a record that maps to a different DNS record.

cloudflare.com. 3600 IN A 192.0.2.1
cloudflare.com. 3600 IN AAAA 2001:db8::1
www 3600 IN CNAME cloudflare.com.

This way the website’s list of IP addresses doesn’t need to be duplicated, but clients requesting A and/or AAAA records for “www.cloudflare.com” will still get the same results as for “cloudflare.com”.

However, there are some cases where using a CNAME might seem like the best option, but ends up subtly breaking the DNS configuration for a website. For example when setting up services such as GitLab Pages, GitHub Pages or Netlify with a custom domain, the user is generally asked to add an A (and sometimes AAAA) record to the DNS configuration for their domain. Those IP addresses are hard-coded in users’ configurations, which means that if the provider of the service ever decides to change the addresses (or add new ones), even if just to provide some form of load-balancing, all of their users will need to manually change their configuration.

Using a CNAME to a more stable domain, which can then have variable A and AAAA records, might seem like a better option, and some of these providers do support that, but it’s important to note that this generally only works for subdomains (like “www” in the previous example) and not apex records. This is because the DNS specification that defines CNAME records states that when a CNAME is defined for a particular name, no other records can be associated with that name. This is fine for subdomains, but apex records need additional records defined, such as SOA and NS, for the DNS configuration to work properly, and could also have records such as MX to make sure emails get properly delivered. In practical terms, this means that defining a CNAME record at the apex of a domain might appear to be working fine in some cases, but be subtly broken in ways that are not immediately apparent.

But what does this all have to do with SVCB and HTTPS records? Well, it turns out that those records can also solve this problem, by defining an alternative format called “alias form” that behaves in the same manner as a CNAME in all the useful ways, but without the annoying historical baggage. A domain operator will be able to define a record such as:

example.com. 3600 IN HTTPS 0 example.org.

and expect it to work as if a CNAME was defined, but without the subtle side-effects.

One more thing

Encrypted SNI is an extension to TLS intended to improve the privacy of users on the Internet. You might remember how it makes use of a custom DNS record to advertise the server’s public key share, which clients then use to derive the secret key necessary to actually encrypt the SNI. In newer revisions of the specification (which is now called “Encrypted ClientHello” or “ECH”) the custom TXT record used previously is simply replaced by a new parameter, called “echconfig”, for the SVCB and HTTPS records.

This means that SVCB/HTTPS are a requirement to support newer revisions of Encrypted SNI/Encrypted ClientHello. More on this later this year.

What now?

This all sounds great, but what does it actually mean for Cloudflare customers? As mentioned earlier, we have enabled initial support for HTTPS records across our edge network. Cloudflare’s DNS servers will automatically generate HTTPS records on the fly to advertise whether a particular zone supports HTTP/3 and/or HTTP/2, based on whether those features are enabled on the zone, and we will later also add Encrypted ClientHello support.

Thanks to Cloudflare’s large network that spans millions of web properties (we happen to be one of the most popular DNS providers), serving these records on our customers' behalf will help build a more secure and performant Internet for anyone that is using a supporting client.

Adopting new protocols requires cooperation between multiple parties. We have been working with various browsers and clients to increase the support and adoption of HTTPS records. Over the last few weeks, Apple’s iOS 14 release has included client support for HTTPS records, allowing connections to be upgraded to QUIC when the HTTP/3 parameter is returned in the DNS record. Apple has reported that so far, of the population that has manually enabled HTTP/3 on iOS 14, 8% of the QUIC connections had the HTTPS record response.


Other browser vendors, such as Google and Mozilla, are also working on shipping support for HTTPS records to their users, and we hope to be hearing more on this front soon.

Categories: Technology

Free, Privacy-First Analytics for a Better Web

Tue, 29/09/2020 - 14:02

Everyone with a website needs to know some basic facts about their website: what pages are people visiting? Where in the world are they? What other sites sent traffic to my website?

There are “free” analytics tools out there, but they come at a cost: not money, but your users’ privacy. Today we’re announcing a brand new, privacy-first analytics service that’s open to everyone — even if they're not already a Cloudflare customer. And if you're a Cloudflare customer, we've enhanced our analytics to make them even more powerful than before.

The most important analytics feature: Privacy

The most popular analytics services available were built to help ad-supported sites sell more ads. But, a lot of websites don’t have ads. So if you use those services, you're giving up the privacy of your users in order to understand how what you've put online is performing.

Cloudflare's business has never been built around tracking users or selling advertising. We don’t want to know what you do on the Internet — it’s not our business. So we wanted to build an analytics service that gets back to what really matters for web creators, not necessarily marketers, and to give web creators the information they need in a simple, clean way that doesn't sacrifice their visitors' privacy. And giving web creators these analytics shouldn’t depend on their use of Cloudflare’s infrastructure for performance and security. (More on that in a bit.)

What does it mean for us to make our analytics “privacy-first”? Most importantly, it means we don’t need to track individual users over time for the purposes of serving analytics. We don’t use any client-side state, like cookies or localStorage, for the purposes of tracking users. And we don’t “fingerprint” individuals via their IP address, User Agent string, or any other data for the purpose of displaying analytics. (We consider fingerprinting even more intrusive than cookies, because users have no way to opt out.)

Counting visits without tracking users

One of the most essential stats about any website is: “how many people went there?” Analytics tools frequently show counts of “unique” visitors, which require tracking individual users by a cookie or IP address.

We use the concept of a visit: a privacy-friendly measure of how people have interacted with your website. A visit is defined simply as a successful page view that has an HTTP referer that doesn’t match the hostname of the request. This tells you how many times people came to your website and clicked around before navigating away, but doesn’t require tracking individuals.
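
In code terms, the check is roughly the following sketch. The helper and its signature are hypothetical, for illustration only, not Cloudflare’s actual implementation.

// Sketch: count a page view as a "visit" when the referer is missing or
// points at a different hostname than the request itself.
function isVisit(requestHostname: string, refererHeader: string | null,
                 status: number): boolean {
  if (status < 200 || status >= 400) return false; // only successful views
  if (!refererHeader) return true; // direct navigation counts as a visit
  try {
    return new URL(refererHeader).hostname !== requestHostname;
  } catch {
    return true; // malformed referer: treat as external
  }
}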


A visit has slightly different semantics from a “unique”, and you should expect this number to differ from other analytics tools.

All of the details, none of the bots

Our analytics deliver the most important metrics about your website, like page views and visits. But we know that an essential analytics feature is flexibility: the ability to add arbitrary filters, and slice-and-dice data as you see fit. Our analytics can show you the top hostnames, URLs, countries, and other critical metrics like status codes. You can filter on any of these metrics with a click and see the whole dashboard update.

I’m especially excited about two features in our time series charts: the ability to drag-to-zoom into a narrower time range, and the ability to “group by” different dimensions to see data in a different way. This is a super powerful way to drill into an anomaly in traffic and quickly see what’s going on. For example, you might notice a spike in traffic, zoom into that spike, and then try different groupings to see what contributed the extra clicks. A GIF is worth a thousand words:


And for customers of our Bot Management product, we’re working on the ability to detect (and remove) automated traffic. Coming very soon, you’ll be able to see which bots are reaching your website and, with just a click, block them using Firewall Rules.

This is all possible thanks to our ABR analytics technology, which enables us to serve analytics very quickly for websites large and small. Check out our blog post to learn more about how this works.

Edge or Browser analytics? Why not both?

There are two ways to collect web analytics data: at the edge (or on an origin server), or in the client using a JavaScript beacon.

Historically, Cloudflare has collected analytics data at our edge. This has some nice benefits over traditional, client-side analytics approaches:

  • It’s more accurate because you don’t miss users who block third-party scripts, or JavaScript altogether
  • You can see all of the traffic back to your origin server, even if an HTML page doesn’t load
  • We can detect (and block) bots, apply Firewall rules, and generally scrub traffic of unwanted noise
  • You can measure the performance of your origin server

Most web analytics providers, however, use client-side measurement. This has some benefits as well:

  • You can understand performance as your users see it -- e.g. how long did the page actually take to render
  • You can detect errors in client-side JavaScript execution
  • You can define custom event types emitted by JavaScript frameworks

Ultimately, we want our customers to have the best of both worlds. We think it’s really powerful to get web traffic numbers directly from the edge. We also launched Browser Insights a year ago to augment our existing edge analytics with more performance information, and today Browser Insights are taking a big step forward by incorporating Web Vitals metrics.

But, we know not everyone can modify their DNS to take advantage of Cloudflare’s edge services. That’s why today we’re announcing a free, standalone analytics product for everyone.

How do I get it?

For existing Cloudflare customers on our Pro, Biz, and Enterprise plans, just go to your Analytics tab! Starting today, you’ll see a banner to opt-in to the new analytics experience. (We plan to make this the default in a few weeks.)

But when building privacy-first analytics, we realized it’s important to make this accessible even to folks who don’t use Cloudflare today. You’ll be able to use Cloudflare’s web analytics even if you can’t change your DNS servers -- just add our JavaScript, and you’re good to go.

We’re still putting the finishing touches on our JavaScript-based analytics, but you can sign up here and we’ll let you know when it’s ready.

The evolution of analytics at Cloudflare

Just over a year ago, Cloudflare’s analytics consisted of a simple set of metrics: cached vs uncached data transfer, or how many requests were blocked by the Firewall. Today we provide flexible, powerful analytics across all our products, including Firewall, Cache, Load Balancing and Network traffic.

While we’ve been focused on building analytics about our products, we realized that our analytics are also powerful as a standalone product. Today is just the first step on that journey. We have so much more planned: from real-time analytics, to ever-more performance analysis, and even allowing customers to add custom events.

We want to hear what you want most out of analytics — drop a note in the comments to let us know what you want to see next.

Categories: Technology

Start measuring Web Vitals with Browser Insights

Tue, 29/09/2020 - 14:00

Many of us at Cloudflare obsess about how to make websites faster. But to improve performance, you have to measure it first. Last year we launched Browser Insights to help our customers measure web performance from the perspective of end users.

Today, we're partnering with the Google Chrome team to bring Web Vitals measurements into Browser Insights. Web Vitals are a new set of metrics to help web developers and website owners measure and understand load time, responsiveness, and visual stability. And with Cloudflare’s Browser Insights, they’re easier to measure than ever – and it’s free for anyone to collect data from the whole web.

Why do we need Web Vitals?

When trying to understand performance, it’s tempting to focus on the metrics that are easy to measure — like Time To First Byte (TTFB). While TTFB and similar metrics are important to understand, we’ve learned that they don’t always tell the whole story.

Our partners on the Google Chrome team have tackled this problem by breaking down user experience into three components:

  • Loading: How long did it take for content to become available?
  • Interactivity: How responsive is the website when you interact with it?
  • Visual stability: How much does the page move around while loading? (I think of this as the inverse of “jankiness”)
This image is reproduced from work created and shared by Google and used according to terms described in the Creative Commons 4.0 Attribution License.

It’s challenging to create a single metric that captures these high-level components. Thankfully, the folks on the Google Chrome team have thought about this, and earlier this year introduced three “Core” Web Vitals metrics: Largest Contentful Paint, First Input Delay, and Cumulative Layout Shift.
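
If you want to experiment with collecting these metrics yourself, Google publishes an open-source web-vitals JavaScript library. A minimal sketch might look like the following; it assumes the library’s v1 API, and the /analytics endpoint is a placeholder for wherever you want to send the data.

import { getCLS, getFID, getLCP } from 'web-vitals';

// Sketch: report each Core Web Vital to a hypothetical collection endpoint.
function report(metric: { name: string; value: number; id: string }) {
  const body = JSON.stringify(metric);
  // sendBeacon survives page unload; fall back to fetch if unavailable.
  (navigator.sendBeacon && navigator.sendBeacon('/analytics', body)) ||
    fetch('/analytics', { body, method: 'POST', keepalive: true });
}

getCLS(report); // Cumulative Layout Shift
getFID(report); // First Input Delay
getLCP(report); // Largest Contentful Paint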

How do Web Vitals help make your website faster?

Measuring the Core Web Vitals isn’t the end of the story. Rather, they’re a jumping off point to understand what factors impact a website’s performance. Web Vitals tells you what is happening at a high level, and other more detailed metrics help you understand why user experience could be slow.

Take loading time, for example. If you notice that your Largest Contentful Paint score is “needs improvement”, you want to dig into what is taking so long to load! Browser Insights still measures navigation timing metrics like DNS lookup time and TTFB. By analyzing these metrics in turn, you might dig further into optimizing cache hit rates, tuning the performance of your origin server, or tweaking the order in which resources like JavaScript and CSS load.


For more information about improving web performance, check out Google’s guides to improving LCP, FID, and CLS.

Why measure Web Vitals with Cloudflare?

First, we think that RUM (Real User Measurement) is a critical companion to synthetic measurement. While you can always try a few page loads on your own laptop and see the results, gathering data from real users is the only way to take into account real-life device performance and network conditions.

There are other great RUM tools out there. Google’s Chrome User Experience Report (CrUX) collects data about the entire web and makes it available through tools like Page Speed Insights (PSI), which combines synthetic and RUM results into useful diagnostic information.

One major benefit of Cloudflare’s Browser Insights is that it updates constantly; new data points are available shortly after we see a request from an end user. The data in the Chrome UX Report is a 28-day rolling average of aggregated metrics, so it takes longer for changes to be reflected in the data.

Another benefit of Browser Insights is that we can measure any browser — not just Chrome. As of this writing, the APIs necessary to report Web Vitals are only supported in Chromium browsers, but we'll support Safari and Firefox when they implement those APIs.

Finally, Browser Insights is free to use! We’ve worked really hard to make our analytics blazing fast for websites with any amount of traffic. We’re excited to support slicing and grouping by URL, Browser, OS, and Country, and plan to support several more dimensions soon.

Push a button to start measuring

To start using Browser Insights, just head over to the Speed tab in the dashboard. Starting today, Web Vitals metrics are now available for everyone!

Behind the scenes, Browser Insights works by inserting a JavaScript "beacon" into HTML pages. You can control where the beacon loads if you only want to measure specific pages or hostnames. If you’re using CSP version 3, we’ll even automatically detect the nonce (if present) and add it to the script.

Where we’ve been, and where we're going

We've been really proud of the success of Browser Insights. We've been hard at work over the last year making lots of improvements — for example, we've made the dashboard fast and responsive (and still free!) even for the largest websites.

Coming soon, we’re excited to make this available for all our Web Analytics customers — even those who don’t use Cloudflare today. We’re also hard at work adding much-requested features like client-side error reporting, and diagnostics tools to make it easier to understand where to improve.

Categories: Technology

Explaining Cloudflare's ABR Analytics

Tue, 29/09/2020 - 14:00

Cloudflare’s analytics products help customers answer questions about their traffic by analyzing the mind-boggling, ever-increasing number of events (HTTP requests, Workers requests, Spectrum events) logged by Cloudflare products every day.  The answers to these questions depend on the point of view of the question being asked, and we’ve come up with a way to exploit this fact to improve the quality and responsiveness of our analytics.

Useful Accuracy

Consider the following questions and answers:

What is the length of the coastline of Great Britain? 12.4K km
What is the total world population? 7.8B
How many stars are in the Milky Way? 250B
What is the total volume of the Antarctic ice shelf? 25.4M km³
What is the worldwide production of lentils? 6.3M tonnes
How many HTTP requests hit my site in the last week? 22.6M

Useful answers do not benefit from being overly exact.  For large quantities, knowing the correct order of magnitude and a few significant digits gives the most useful answer.  At Cloudflare, the difference in traffic between different sites or when a single site is under attack can cross nine orders of magnitude and, in general, all our traffic follows a Pareto distribution, meaning that what’s appropriate for one site or one moment in time might not work for another.


Because of this distribution, a query that scans a few hundred records for one customer will need to scan billions for another.  A report that needs to load a handful of rows under normal operation might need to load millions when a site is under attack.

To get a sense of the relative difference of each of these numbers, remember “Powers of Ten”, an amazing visualization that Ray and Charles Eames produced in 1977.  Notice that the scale of an image determines what resolution is practical for recording and displaying it.

Using ABR to determine resolution

This basic fact informed our design and implementation of ABR for Cloudflare analytics.  ABR stands for “Adaptive Bit Rate”.  The name is borrowed from video streaming, such as Cloudflare’s own Stream Delivery, where the server selects the best resolution for a video stream to match your client and network connection.

In our case, every analytics query that supports ABR will be calculated at a resolution matching the query.  For example, if you want to know which country generated the most firewall events in the past week, the system might opt to use a lower resolution version of the firewall data than if you were looking at the last hour.  The lower resolution version will provide the same answer but take less time and fewer resources.  By using multiple, different resolutions of the same data, our analytics can provide consistent response times and a better user experience.

You might be aware that we use a columnar store called ClickHouse to store and process our analytics data.  When using ABR with ClickHouse, we write the same data at multiple resolutions into separate tables.  Usually, we cover seven orders of magnitude – from 100% to 0.0001% of the original events.  We wind up using an additional 12% of disk storage but enable very fast ad hoc queries on the reduced resolution tables.
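
As a simplified illustration of the idea (not our actual query planner; the table names and row threshold below are made up), a router over such tables might pick the lowest-resolution sample that still scans enough rows for a statistically useful answer, then scale the counts back up by the inverse of the sample rate:

// Sketch: pick a sampled table based on how many rows the query would scan,
// then correct the counts by the inverse of the sample rate.
const tables = [
  { name: 'events_100', rate: 1 },    // 100% of events
  { name: 'events_10',  rate: 0.1 },  // 10% sample
  { name: 'events_1',   rate: 0.01 }, // 1% sample
  // ... down to 0.0001%
];

function pickTable(estimatedRows: number, targetRows = 10_000) {
  // Prefer the coarsest table that still scans roughly targetRows rows.
  for (const t of [...tables].reverse()) {
    if (estimatedRows * t.rate >= targetRows) return t;
  }
  return tables[0]; // small query: just use the full-resolution table
}

const t = pickTable(5_000_000_000);
// SELECT count(*) / {t.rate} FROM {t.name} WHERE ...  -- scaled estimate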

Aggregations and Rollups

The ABR technique facilitates aggregations by making compact estimates of every dimension.  Another way to achieve the same ends is with a system that computes “rollups”.  Rollups save space by computing either complete or partial aggregations of the data as it arrives.  

For example, suppose we wanted to count the total number of lentils. (Lentils are legumes and among the oldest and most widely cultivated crops.  They are a staple food in many parts of the world.)  We could just count each lentil as it passed through the processing system. Of course, because there are a lot of lentils, that system is distributed – meaning that there are hundreds of separate machines.  Therefore we’ll actually have hundreds of separate counters.

Also, we’ll want to include more information than just the count, so we’ll also include the weight of each lentil and maybe 10 or 20 other attributes. And of course, we don’t want just a total for each attribute, but we’ll want to be able to break it down by color, origin, distributor and many other things, and also we’ll want to break these down by slices of time.

In the end, we’ll have tens of thousands or possibly millions of aggregations to be tabulated and saved every minute.  These aggregations are expensive to compute, especially when using aggregations more complicated than simple counters and sums.  They also destroy some information.  For example, once we’ve processed all the lentils through the rollups, we can’t say for sure that we’ve counted them all, and most importantly, whichever attributes we neglected to aggregate are unavailable.

The number we’re counting, 6.3M tonnes, only includes two significant digits, which can easily be achieved by counting a sample.  Most of the rollup computations used on each lentil (on the order of 10¹³ lentils to account for 6.3M tonnes) are wasted.

Other forms of aggregations

So far, we’ve discussed ABR and its application to aggregations, but we’ve only given examples involving “counts” and “sums”.  There are other, more complex forms of aggregations we use quite heavily.  Two examples are “topK” and “count-distinct”.

A “topK” aggregation attempts to show the K most frequent items in a set.  For example, the most frequent IP address, or country.  To compute topK, just count the frequency of each item in the set and return the K items with the highest frequencies. Under ABR, we compute topK based on the set found in the matching resolution sample. Using a sample makes this computation a lot faster and less complex, but there are problems.

The estimate of topK derived from a sample is biased and depends on the distribution of the underlying data. This can result in overestimating the significance of elements in the set as compared to their frequency in the full set. In practice this effect can only be noticed when the cardinality of the set is very high, and you’re not going to notice it on a Cloudflare dashboard.  If your site has a lot of traffic and you’re looking at the top K URLs or browser types, there will be no difference visible at different resolutions.  Also keep in mind that as long as we’re estimating the “proportion” of an element in the set and the set is large, the results will be quite accurate.

The other fascinating aggregation we support is known as “count-distinct”, or number of uniques.  In this case we want to know the number of unique values in a set.  For example, how many unique cache keys have been used.  We can safely say that a uniform random sample of the set cannot be used to estimate this number.  However, we do have a solution.

We can generate another, alternate sample based on the value in question.  For example, instead of taking a random sample of all requests, we take a random sample of IP addresses.  This is sometimes called distinct reservoir sampling, and it allows us to estimate the true number of distinct IPs based on the cardinality of the sampled set. Again, there are techniques available to improve these estimates, and we’ll be implementing some of those.
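
As a sketch of that idea (simplified; production estimators are more careful), sampling on a hash of the value itself keeps every occurrence of a kept IP while discarding all occurrences of the others, so the distinct count of the sample can simply be scaled back up:

import { createHash } from 'crypto';

// Sketch: sample *values* (here, IP addresses) rather than events, by keeping
// only values whose hash falls in a fixed slice of the hash space.
const RATE = 0.01; // keep ~1% of distinct IPs

function keep(ip: string): boolean {
  const h = createHash('md5').update(ip).digest();
  // Use the first 4 bytes as a uniform number in [0, 1).
  return h.readUInt32BE(0) / 2 ** 32 < RATE;
}

function estimateDistinct(ips: Iterable<string>): number {
  const sampled = new Set<string>();
  for (const ip of ips) if (keep(ip)) sampled.add(ip);
  // Every distinct IP had probability RATE of being kept, so scale back up.
  return Math.round(sampled.size / RATE);
}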

ABR improves resilience and scalability

Using ABR saves us resources.  Even better, it allows us to query all the attributes in the original data, not just those included in rollups.  And because the original events are preserved, we can check our assumptions against different sample intervals in separate tables to verify that the system is working correctly.

However, the greatest benefits of employing ABR are the ones that aren’t directly visible. Even under ideal conditions, a large distributed system such as Cloudflare’s data pipeline is subject to high tail latency.  This occurs when any single part of the system takes longer than usual for any number of a long list of reasons.  In these cases, the ABR system will adapt to provide the best results available at that moment in time.

For example, compare this chart showing Cache Performance for a site under attack with the same chart generated a moment later while we simulate a failure of some of the servers in our cluster.  In the days before ABR, your Cloudflare dashboard would fail to load in this scenario.  Now, with ABR analytics, you won’t see significant degradation.


Stretching the analogy to ABR in video streaming, we want you to be able to enjoy your analytics dashboard without being bothered by issues related to faulty servers, or network latency, or long running queries.  With ABR you can get appropriate answers to your questions reliably and within a predictable amount of time.

In the coming months, we’re going to be releasing a variety of new dashboards and analytics products based on this simple but profound technology.  Watch your Cloudflare dashboard for increasingly useful and interactive analytics.

Categories: Technology

Birthday Week on Cloudflare TV: 24 Hours of Live Discussions on the Future of the Internet

Mon, 28/09/2020 - 18:01

This week marks Cloudflare’s 10th birthday, and we’re excited to continue our annual tradition of launching an array of products designed to help give back to the Internet. (Check back here each morning for the latest!)

We also see this milestone as an opportunity to reflect on where the Internet was ten years ago, and where it might be headed over the next decade. So we reached out to some of the people we respect most to see if they’d be interested in joining us for a series of Fireside Chats on Cloudflare TV.

We’ve been blown away by the response, and are thrilled to announce our lineup of speakers, featuring many of the most celebrated names in tech and beyond. Among the highlights: Apple co-founder Steve Wozniak, Zoom CEO Eric Yuan, OpenTable CEO Debby Soo, Stripe co-founder and President John Collison, Former CEO & Executive Chairman of Google and Co-Founder of Schmidt Futures Eric Schmidt, former McAfee CEO Chris Young, former Seal Team 6 Commander Dave Cooper, Project Include CEO Ellen Pao, and so many more. All told, we have over 24 hours of live discussions scheduled throughout the week.

To tune in, just head to Cloudflare TV (no registration required). You can view the details for each session by clicking the links below, where you’ll find handy Watch Now buttons to make sure you don’t miss anything. We’ll also be rebroadcasting these talks throughout the week, so they’ll be easily accessible in different timezones.

A tremendous thank you to everyone on this list for helping us celebrate Cloudflare’s 10th annual Birthday Week!

Jay Adelson

Founder of Equinix and Chairman & Co-Founder of Scorbit
Aired live on Thursday, October 1, 10:00 AM (PDT) // Watch Now

Shellye Archambeau

Fortune 500 Board Member and Author & Former CEO of MetricStream
Aired live on Thursday, October 1, 6:30 PM (PDT) // Watch Now

Abhinav Asthana

Founder & CEO of Postman
Aired live on Wednesday, September 30, 3:30 PM (PDT) // Watch Now

Azeem Azhar

Founder of Exponential View
Aired live on Friday, October 2, 9:00 AM (PDT) // Watch Now

John Battelle

Co-Founder & CEO of Recount Media
Aired live on Wednesday, September 30, 8:30 AM (PDT) // Watch Now

Christian Beedgen

CTO & Co-Founder of SumoLogic
Aired live on Thursday, October 1, 7:00 AM (PDT) // Watch Now

Gleb Budman

CEO & Co-Founder of Backblaze
Aired live on Wednesday, September 30, 2:00 PM (PDT) // Watch Now

Hayden Brown

CEO of Upwork
Aired live on Wednesday, September 30, 5:00 PM (PDT) // Watch Now

Stewart Butterfield

CEO of Slack
Aired live on Thursday, October 1, 8:30 AM (PDT) // Watch Now

John P. Carlin

Former Assistant Attorney General for the US Department of Justice’s National Security Division and current Chair of Morrison & Foerster’s Global Risk + Crisis Management practice
Aired live on Tuesday, September 29, 12:00 PM (PDT) // Watch Now

John Collison

Co-Founder & President of Stripe
Aired live on Friday, October 2, 3:00 PM (PDT) // Watch Now

Dave Cooper

Former Seal Team 6 Commander
Aired live on Tuesday, September 29, 10:30 AM (PDT) // Watch Now

Scott Galloway

Founder & Chair of L2
Aired live on Wednesday, September 30, 12:00 PM (PDT) // Watch Now

Kara Goldin

Founder & CEO of Hint Inc.
Aired live on Thursday, October 1, 12:30 PM (PDT) // Watch Now

David Gosset

Founder of Europe China Forum
Aired live on Monday, September 28, 5:00 PM (PDT) // Watch Now

Jon Green

VP and Chief Technologist for Security at Aruba, a Hewlett Packard Enterprise company
Aired live on Wednesday, September 30, 9:00 AM (PDT) // Watch Now

Arvind Gupta

Former CEO of MyGov, Govt. of India and current Head & Co-Founder of Digital India Foundation
Aired live on Monday, September 28, 8:00 PM (PDT) // Watch Now

Anu Hariharan

Partner at Y Combinator
Aired live on Monday, September 28, 1:00 PM (PDT) // Watch Now

Brett Hautop

VP of Global Design + Build at LinkedIn
Aired live on Friday, October 2, 11:00 AM (PDT) // Watch Now

Erik Hersman

CEO of BRCK
Aired live on Friday, October 2, 4:30 AM (PDT) // Watch Now

Jennifer Hyman

CEO & Co-Founder of Rent the Runway
Aired live on Wednesday, September 30, 1:00 PM (PDT) // Watch Now

Paul Judge

Co-Founder & Partner of TechSquare Labs and Co-Founder & Executive Chairman of Pindrop
Aired live on Wednesday, September 30, 8:00 AM (PDT) // Watch Now

David Kaye

Former UN Special Rapporteur
Aired live on Thursday, October 1, 8:00 AM (PDT) // Watch Now

Pam Kostka

CEO of All Raise
Aired live on Thursday, October 1, 1:30 PM (PDT) // Watch Now

Raffi Krikorian

Managing Director at Emerson Collective and former Engineering Executive at Twitter & Uber
Aired live on Friday, October 2, 1:30 PM (PDT) // Watch Now

Albert Lee

Co-Founder of MyFitnessPal
Aired live on Monday, September 28, 12:00 PM (PDT) // Watch Now

Aaron Levie

CEO & Co-Founder of Box
Aired live on Thursday, October 1, 4:30 PM (PDT) // Watch Now

Alexander Macgillivray

Co-Founder & GC of Alloy and former Deputy CTO of US Government
Aired live on Thursday, October 1, 11:30 AM (PDT) // Coming Soon

Ellen Pao

Former CEO of Reddit and current CEO of Project Include
Aired live on Tuesday, September 29, 2:00 PM (PDT) // Coming Soon

Keith Rabois

General Partner at Founders Fund and former COO of Square
Aired live on Wednesday, September 30, 3:00 PM (PDT) // Watch Now

Eric Schmidt

Former CEO of Google and current Technical Advisor at Alphabet, Inc.
Aired live on Monday, September 28, 12:30 PM (PDT) // Watch Now

Pradeep Sindhu

Founder & Chief Scientist at Juniper Networks, and Founder & CEO at Fungible
Aired live on Wednesday, September 30, 11:30 AM (PDT) // Watch Now

Karan Singh

Co-Founder & Chief Operating Officer of Ginger
Aired live on Monday, September 28, 3:00 PM (PDT) // Watch Now

Debby Soo

CEO of OpenTable and former Chief Commercial Officer of KAYAK
Aired live on Wednesday, September 30, 12:30 PM (PDT) // Watch Now

Dan Springer

CEO of DocuSign
Aired live on Thursday, October 1, 1:00 PM (PDT) // Watch Now

Bonita Stewart

Vice President, Global Partnerships & Americas Partnerships Solutions at Google
Aired live on Friday, October 2, 2:00 PM (PDT) // Watch Now

Hemant Taneja

Managing Director at General Catalyst
Aired live on Friday, October 2, 4:00 PM (PDT) // Watch Now

Bret Taylor

President & Chief Operating Officer of Salesforce
Aired live on Friday, October 2, 12:00 PM (PDT) // Watch Now

Robert Thomson

Chief Executive at News Corp and former Editor-in-Chief at The Wall Street Journal & Dow Jones
Aired live on Thursday, October 1, 12:00 PM (PDT) // Watch Now

Robin Thurston

Founder & CEO of Pocket Outdoor Media and former EVP, Chief Digital Officer of Under Armour
Aired live on Monday, September 28, 12:00 PM (PDT) // Watch Now

Selina Tobaccowala

Chief Digital Officer at Openfit, Co-Founder of Gixo, and former President & CTO of SurveyMonkey
Aired live on Thursday, October 1, 9:00 AM (PDT) // Watch Now

Ben Wizner

Director of ACLU's Speech, Privacy, & Technology Project
Aired live on Thursday, October 1, 10:30 AM (PDT) // Coming Soon

Michael Wolf

Founder & CEO of Activate and former President and Chief Operating Officer of MTV Networks
Aired live on Tuesday, September 29, 3:30 PM (PDT) // Watch Now

Josh Wolfe

Co-Founder and Managing Partner of Lux Capital
Aired live on Friday, October 2, 1:00 PM (PDT) // Watch Now

Steve Wozniak

Co-Founder of Apple, Inc.
Aired live on Wednesday, September 30, 10:00 AM (PDT) // Watch Now

Chris Young

Former CEO of McAfee
Aired live on Thursday, October 1, 11:00 AM (PDT) // Watch Now

Eric Yuan

Founder & Chief Executive Officer of Zoom
Aired live on Monday, September 28, 3:30 PM (PDT) // Watch Now

Categories: Technology

Making Time for Cron Triggers: A Look Inside

Mon, 28/09/2020 - 14:02

Today, we are excited to launch Cron Triggers on the Cloudflare Workers serverless compute platform. We’ve heard the developer feedback, and we want to give our users the ability to run a given Worker on a scheduled basis. In case you’re not familiar with Unix systems, the cron pattern allows developers to schedule jobs to run at fixed intervals. This pattern is ideal for any type of periodic job, like maintenance tasks or calling third-party APIs to fetch up-to-date data. Cron Triggers have been a highly requested feature even inside Cloudflare, and we hope you will find them as useful as we have!

Where are Cron Triggers going to be run?

Cron Triggers are executed from the edge. At Cloudflare, we believe strongly in edge computing and wanted our new feature to get all of the performance and reliability benefits of running on our edge. Thus, we wrote a service in core that is responsible for distributing schedules to a new edge service through Quicksilver which will then trigger the Workers themselves.

What’s happening under the hood?

At a high level, schedules created through our API create records in our database with the information necessary to execute the Worker and the given cron schedule. These records are then picked up by another service which continuously evaluates the state of our edge and distributes the schedules between cities.

Once the schedules have been distributed to the edge, a service running in the edge node polls for changes to the schedules and makes sure they get sent to our runtime at the appropriate time.

New Event Type

Cron Triggers gave us the opportunity to finally recognize a new Worker 'type' in our API. While Workers currently only run on web requests, we have lots of ideas for the future of edge computing that aren’t strictly tied to HTTP requests and responses. Expect to see even more new handlers in the future for other non-HTTP events like log information from your Worker (think custom wrangler tail!) or even TCP Workers.

Here’s an example of the new JavaScript API:

addEventListener('scheduled', event => {
  event.waitUntil(someAsyncFunction(event))
})

Where event has the following interface in TypeScript:

interface ScheduledEvent {
  type: 'scheduled';
  scheduledTime: number; // milliseconds since the Unix epoch
}

As long as your Worker has a handler for this new event type, you’ll be able to give it a schedule.

New APIs

PUT /client/v4/accounts/:account_identifier/workers/scripts/:name

The script upload API remains the same, but during script validation we now detect and return the registered event handlers.

PUT /client/v4/accounts/:account_identifier/workers/scripts/:name/schedules

Body:

[
  {"cron": "* * * * *"},
  ...
]

This will create or modify all schedules for a script, removing all schedules not in the list. For now, there’s a limit of 3 distinct cron schedules. Schedules can run as often as once a minute and don’t accept schedules with years in them (sorry, you’ll have to run your Y3K migration script another way).
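
For example, a schedule could be set from a script with a plain HTTP call like the sketch below; the account ID, script name, and API token are placeholders you would fill in with your own values.

// Sketch: set a single cron schedule on a Worker via the REST API.
const accountId = 'YOUR_ACCOUNT_ID'; // placeholder
const scriptName = 'my-worker';      // placeholder
const apiToken = 'YOUR_API_TOKEN';   // placeholder

async function setSchedules() {
  const res = await fetch(
    `https://api.cloudflare.com/client/v4/accounts/${accountId}` +
      `/workers/scripts/${scriptName}/schedules`,
    {
      method: 'PUT',
      headers: {
        'Authorization': `Bearer ${apiToken}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify([{ cron: '* * * * *' }]), // every minute
    }
  );
  console.log(await res.json());
}

setSchedules();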

GET /client/v4/accounts/:account_identifier/workers/scripts/:name/schedules

Response:

{
  "schedules": [
    {
      "cron": "* * * * *",
      "created_on": <time>,
      "modified_on": <time>
    },
    ...
  ]
}

The Scheduler service is responsible for reading the schedules from Postgres and generating per-node schedules to place into Quicksilver. For now, the service simply avoids trying to execute your Worker on an edge node that may be disabled for some reason, but such an approach also gives us a lot of flexibility in deciding where your Worker executes.

In addition to edge node availability, we could optimize for compute cost, bandwidth, or even latency in the future!

What’s actually executing these schedules?

To consume the schedules and actually trigger the Worker, we built a new service in Rust and deployed to our edge using HashiCorp Nomad. Nomad ensures that the schedule runner remains running in the edge node and can move it between machines as necessary. Rust was the best choice for this service since it needed to be fast with high availability and Cap’n Proto RPC support for calling into the runtime. With Tokio, Anyhow, Clap, and Serde, it was easy to quickly get the service up and running without having to really worry about async, error handling, or configuration.

On top of that, due to our specific needs for cron parsing, we built a specialized cron parser using nom that allowed us to quickly parse and compile expressions into values that check against a given time to determine if we should run a schedule.
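
To give a feel for what “compiling” an expression into values means, here is an illustrative TypeScript sketch; the production parser is the Rust/nom one described above, and this toy version ignores subtleties like ranges and the day-of-month/day-of-week interaction.

// Sketch: compile each cron field ("*", "5", "1,15", "*/10") into a Set of
// allowed values, then check a Date against the five compiled fields.
function compileField(field: string, min: number, max: number): Set<number> {
  const out = new Set<number>();
  for (const part of field.split(',')) {
    if (part === '*') for (let v = min; v <= max; v++) out.add(v);
    else if (part.startsWith('*/')) {
      const step = Number(part.slice(2));
      for (let v = min; v <= max; v += step) out.add(v);
    } else out.add(Number(part));
  }
  return out;
}

function matches(expr: string, d: Date): boolean {
  const [m, h, dom, mon, dow] = expr.split(' ');
  return compileField(m, 0, 59).has(d.getUTCMinutes()) &&
    compileField(h, 0, 23).has(d.getUTCHours()) &&
    compileField(dom, 1, 31).has(d.getUTCDate()) &&
    compileField(mon, 1, 12).has(d.getUTCMonth() + 1) &&
    compileField(dow, 0, 6).has(d.getUTCDay()); // Sunday = 0
}

// matches('*/5 * * * *', new Date()) is true every five minutes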

Once the schedule runner has the schedules, it checks the time and selects the Workers that need to be run. To let the runtime know it’s time to run, we send a Cap'n Proto RPC message. The runtime then does its thing, calling the new ‘scheduled’ event handler instead of ‘fetch’.

How can I try this?

As of today, the Cron Triggers feature is live! Please try it out by creating a Worker and finding the Triggers tab - we’re excited to see what you build with it!

Categories: Technology
