Blogroll Category: Technology

I read blogs, as well as write one. The 'blogroll' on this site reproduces some posts from some of the people I enjoy reading. There are currently 60 posts from the category 'Technology.'

Disclaimer: Reproducing an article here does not necessarily imply agreement or endorsement!

Adopting a new approach to HTTP prioritization

CloudFlare - Tue, 31/12/2019 - 19:13
Adopting a new approach to HTTP prioritization

Friday the 13th is a lucky day for Cloudflare for many reasons. On December 13, 2019 Tommy Pauly, co-chair of the IETF HTTP Working Group, announced the adoption of the "Extensible Prioritization Scheme for HTTP" - a new approach to HTTP prioritization.

Web pages are made up of many resources that must be downloaded before they can be presented to the user. The role of HTTP prioritization is to load the right bytes at the right time in order to achieve the best performance. This is a collaborative process between client and server: a client sends priority signals that the server can use to schedule the delivery of response data. In HTTP/1.1 the signal is basic: clients order requests smartly across a pool of about six connections. In HTTP/2 a single connection is used and clients send a signal per request, as a frame, which describes the relative dependency and weighting of the response. HTTP/3 tried to use the same approach, but dependencies don't work well when signals can be delivered out of order.

HTTP/3 is being standardised as part of the QUIC effort. As a Working Group (WG) we've been trying to fix the problems that non-deterministic ordering poses for HTTP priorities. However, in parallel some of us have been working on an alternative solution, the Extensible Prioritization Scheme, which fixes problems by dropping dependencies and using an absolute weighting. This is signalled in an HTTP header field, meaning it can be backported to work with HTTP/2 or carried over HTTP/1.1 hops. The alternative proposal is documented in the Individual Draft draft-kazuho-httpbis-priority-04, co-authored by Kazuho Oku (Fastly) and myself. This has now been adopted by the IETF HTTP WG as the basis of further work; its adopted name will be draft-ietf-httpbis-priority-00.

To some extent document adoption is the end of one journey and the start of the next; sometimes the authors of the original work are not the best people to oversee the next phase. However, I'm pleased to say that Kazuho and I have been selected as co-editors of this new document. In this role we will reflect the consensus of the WG and help steward the next chapter of HTTP prioritization standardisation. Before the next journey begins in earnest, I wanted to take the opportunity to share my thoughts on the story of developing the alternative prioritization scheme through 2019.

I'd love to explain all the details of this new approach to HTTP prioritization but the truth is I expect the standardization process to refine the design and for things to go stale quickly. However, it doesn't hurt to give a taste of what's in store, just be aware that it is all subject to change.

A recap on priorities

The essence of HTTP prioritization comes down to trying to download many things over constrained connectivity. To borrow some text from Pat Meenan: Web pages are made up of dozens (sometimes hundreds) of separate resources that are loaded and assembled by a browser into the final displayed content. Since it is not possible to download everything immediately, we prefer to fetch more important things before less important ones. The challenge comes in signalling the importance from client to server.

In HTTP/2, every connection has a priority tree that expresses the relative importance between requests. Servers use this to determine how to schedule sending response data. The tree starts with a single root node and as requests are made they either depend on the root or each other. Servers may use the tree to decide how to schedule sending resources but clients cannot force a server to behave in any particular way.

To illustrate, imagine a client that makes three simple GET requests that all depend on root. As the server receives each request it grows its view of the priority tree:

The server starts with only the root node of the priority tree. As requests arrive, the tree grows. In this case all requests depend on the root, so the requests are priority siblings.

Once all requests are received, the server determines all requests have equal priority and that it should send response data using round-robin scheduling: send some fraction of response 1, then a fraction of response 2, then a fraction of response 3, and repeat until all responses are complete.
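
To make the round-robin idea concrete, here is a minimal TypeScript sketch of that scheduling loop. It is an illustration only, not any real server's code, and the chunk size and response sizes are invented for the example.

// A minimal sketch of round-robin scheduling for equal-priority responses.
type PendingResponse = { id: number; remaining: number };

function roundRobin(responses: PendingResponse[], chunkSize: number): number[] {
  const schedule: number[] = [];
  // Keep cycling while any response still has bytes left to send.
  while (responses.some(r => r.remaining > 0)) {
    for (const r of responses) {
      if (r.remaining > 0) {
        schedule.push(r.id); // send one chunk of this response
        r.remaining -= Math.min(chunkSize, r.remaining);
      }
    }
  }
  return schedule;
}

// Three sibling responses of different sizes, served in 10-byte chunks.
console.log(roundRobin(
  [{ id: 1, remaining: 30 }, { id: 2, remaining: 20 }, { id: 3, remaining: 10 }],
  10
)); // [1, 2, 3, 1, 2, 1]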

A single HTTP/2 request-response exchange is made up of frames that are sent on a stream. A simple GET request would be sent using a single HEADERS frame:

HTTP/2 HEADERS frame

Each region of a frame is a named field; a '?' indicates the field is optional, and the value in parentheses is the length in bytes, with '*' meaning variable length. The Header Block Fragment field holds compressed HTTP header fields (using HPACK), Pad Length and Padding relate to optional padding, and E, Stream Dependency and Weight combined are the priority signal that controls the priority tree.

The Stream Dependency and Weight fields are optional but their absence is interpreted as a signal to use the default values: dependency on the root with a weight of 16, meaning that the default priority scheduling strategy is round-robin. However, this is often a bad choice because important resources like HTML, CSS and JavaScript are tied up with things like large images. The following animation demonstrates this in the Edge browser, causing the page to be blank for 19 seconds. Our deep dive blog post explains the problem further.

The HEADERS frame E field is the interesting bit (pun intended). A request with the field set to 1 (true) means that the dependency is exclusive and nothing else can depend on the indicated node. To illustrate, imagine a client that sends three requests which set the E field to 1. As the server receives each request, it interprets this as an exclusive dependency on the root node. Because all requests have the same dependency on root, the tree has to be shuffled around to satisfy the exclusivity rules.

Each request has an exclusive dependency on the root node. The tree is shuffled as each request is received by the server.

The final version of the tree looks very different from our previous example. The server would schedule all of response 3, then all of response 2, then all of response 1. This could help load all of an HTML file before an image and thus improve the visual load behaviour.
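
That reshuffling can be sketched in a few lines. The TypeScript below is a simplified model of the exclusive-dependency rule, in which the new node adopts the parent's existing children; it is an illustration of the tree manipulation, not an HTTP/2 implementation, and the node shape is invented for the example.

// Simplified priority-tree model: an exclusive dependency adopts the
// parent's existing children before becoming the parent's sole child.
interface PriorityNode { id: string; children: PriorityNode[] }

function addDependency(parent: PriorityNode, child: PriorityNode, exclusive: boolean): void {
  if (exclusive) {
    child.children.push(...parent.children); // existing children move under the new node
    parent.children = [];
  }
  parent.children.push(child);
}

const root: PriorityNode = { id: "root", children: [] };
for (const id of ["request 1", "request 2", "request 3"]) {
  addDependency(root, { id, children: [] }, true); // E field set to 1 on every request
}
// Resulting chain: root -> request 3 -> request 2 -> request 1, so the server
// would send all of response 3, then all of response 2, then all of response 1.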

In reality, clients load a lot more than three resources and use a mix of priority signals. To understand the priority of any single request, we need to understand all requests. That presents some technological challenges, especially for servers that act like proxies such as the Cloudflare edge network. Some servers have problems applying prioritization effectively.

Because not all clients send the most optimal priority signals we were motivated to develop Cloudflare's Enhanced HTTP/2 Prioritization, announced last May during Speed Week. This was a joint project between the Speed team (Andrew Galloni, Pat Meenan, Kornel Lesiński) and Protocols team (Nick Jones, Shih-Chiang Chien) and others. It replaces the complicated priority tree with a simpler scheme that is well suited to web resources. Because the feature is implemented on the server side, we avoid requiring any modification of clients or the HTTP/2 protocol itself. Be sure to check out my colleague Nick's blog post that details some of the technical challenges and changes needed to let our servers deliver smarter priorities.

The Extensible Prioritization Scheme proposal

The scheme specified in draft-kazuho-httpbis-priority-04 defines a way for priorities to be expressed in absolute terms. It replaces HTTP/2's dependency-based relative prioritization: the priority of a request is independent of others, which makes it easier to reason about and easier to schedule.

Rather than send the priority signal in a frame, the scheme defines an HTTP header - tentatively named "Priority" - that can carry an urgency on a scale of 0 (highest) to 7 (lowest). For example, a client could express the priority of an important resource by sending a request with:

Priority: u=0

And a less important background resource could be requested with:

Priority: u=7
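
As a rough illustration of what a server might do with that signal, the TypeScript sketch below pulls the urgency out of a Priority header value. The 0 (highest) to 7 (lowest) scale comes from the draft; the regex shortcut and the fallback value of 3 are assumptions made for this example rather than anything the draft mandates.

// Hypothetical helper: extract the urgency from a "Priority: u=N" header value.
function parseUrgency(priorityHeader: string | null, fallback = 3): number {
  if (!priorityHeader) return fallback; // no signal sent; assume a mid-range default
  // Look for a "u=<digit>" parameter anywhere in the header value.
  const match = priorityHeader.match(/(?:^|[,;]\s*)u=([0-7])/);
  return match ? Number(match[1]) : fallback;
}

console.log(parseUrgency("u=0")); // 0 -> important resource, schedule early
console.log(parseUrgency("u=7")); // 7 -> background resource, schedule late
console.log(parseUrgency(null));  // 3 -> assumed default when no header is present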

While Kazuho and I are the main authors of this specification, we were inspired by several ideas in the Internet community, and we have incorporated feedback or direct input from many of our peers over several drafts. The text today reflects the efforts-so-far of cross-industry work involving many engineers and researchers from organizations such as Adobe, Akamai, Apple, Cloudflare, Fastly, Facebook, Google, Microsoft, Mozilla and UHasselt. Adoption in the HTTP Working Group means that we can help improve the design and specification by spending some IETF time and resources for broader discussion, feedback and implementation experience.

The backstory

I work in Cloudflare's Protocols team which is responsible for terminating HTTP at the edge. We deal with things like TCP, TLS, QUIC, HTTP/1.x, HTTP/2 and HTTP/3 and since joining the company I've worked with Alessandro Ghedini, Junho Choi and Lohith Bellad to make QUIC and HTTP/3 generally available last September.

Working on emerging standards is fun. It involves an eclectic mix of engineering, meetings, document review, specification writing, time zones, personalities, and organizational boundaries. So while working on the codebase of quiche, our open source implementation of QUIC and HTTP/3, I am also mulling over design details of the protocols and discussing them in cross-industry venues like the IETF.

Because of HTTP/3's lineage, it carries over a lot of features from HTTP/2 including the priority signals and tree described earlier in the post.

One of the key benefits of HTTP/3 is that it is more resilient to the effect of lossy network conditions on performance; head-of-line blocking is limited because requests and responses can progress independently. This is, however, a double-edged sword because sometimes ordering is important. In HTTP/3 there is no guarantee that requests are received in the same order that they were sent, so the priority tree can get out of sync between client and server. Imagine a client that makes two requests with priority signals stating that request 1 depends on root and request 2 depends on request 1. If request 2 arrives before request 1, the dependency cannot be resolved and becomes dangling. In such a case, what is the best thing for a server to do? Ambiguity in behaviour leads to assumptions and disappointment. We should try to avoid that.

Request 1 depends on root and request 2 depends on request 1. If an HTTP/3 server receives request 2 first, the dependency cannot be resolved.

This is just one example where things get tricky quickly. Unfortunately the WG kept finding edge case upon edge case with the priority tree model. We tried to find solutions but each additional fix seemed to add further complexity to the HTTP/3 design. This is a problem because it makes it hard to implement a server that handles priority correctly.

In parallel with Cloudflare's work on implementing better prioritization for HTTP/2, in January 2019 Pat posted his proposal for an alternative prioritization scheme for HTTP/3 in a message to the IETF HTTP WG.

Arguably HTTP/2 prioritization never lived up to its hype. However, replacing it with something else in HTTP/3 is a challenge because the QUIC WG charter required us to try and maintain parity between the protocols. Mark Nottingham, co-chair of the HTTP and QUIC WGs, responded with a good summary of the situation. To quote part of that response:

My sense is that people know that we need to do something about prioritisation, but we're not yet confident about any particular solution. Experimentation with new schemes as HTTP/2 extensions would be very helpful, as it would give us some data to work with. If you'd like to propose such an extension, this is the right place to do it.

And so started a very interesting year of cross-industry discussion on the future of HTTP prioritization.

A year of prioritization

The following is an account of my personal experiences during 2019. It's been a busy year and there may be unintentional errors or omissions; please let me know if you think that is the case. But I hope it gives you a taste of the standardization process and a look behind the scenes of how new Internet protocols that benefit everyone come to life.

January

Pat's email came at the same time that I was attending the QUIC WG Tokyo interim meeting hosted at Akamai (thanks to Mike Bishop for arrangements). So I was able to speak to a few people face-to-face on the topic. There was a bit of mailing list chatter but it tailed off after a few days.

February to April

Things remained quiet in terms of prioritization discussion. I knew the next best opportunity to get the ball rolling would be the HTTP Workshop 2019 held in April. The workshop is a multi-day event not associated with a standards-defining-organization (even if many of the attendees also go to meetings such as the IETF or W3C). It is structured in a way that allows the agenda to be more fluid than a typical standards meeting and gives plenty of time for organic conversation. This sometimes helps overcome gnarly problems, such as the community finding a path forward for WebSockets over HTTP/2 due to a productive discussion during the 2017 workshop. HTTP prioritization is a gnarly problem, so I was inspired to pitch it as a talk idea. It was selected and you can find the full slide deck here.

During the presentation I recounted the history of HTTP prioritization. The great thing about working on open standards is that many email threads, presentation materials and meeting materials are publicly archived. It's fun digging through this history. Did you know that HTTP/2 is based on SPDY and inherited its weight-based prioritization scheme, and that the tree-based scheme we are familiar with today was only introduced in draft-ietf-httpbis-http2-11? One of the reasons for the more-complicated tree was to help HTTP intermediaries (a.k.a. proxies) implement clever resource management. However, it became clear during the discussion that no intermediaries implement this, and none seem to plan to. I also explained a bit more about Pat's alternative scheme and Nick described his implementation experiences. Despite some interesting discussion around the topic, however, we didn't come to any definitive solution. There were a lot of other interesting topics to discover that week.

May

In early May, Ian Swett (Google) restarted interest in Pat's mailing list thread. Unfortunately he was not present at the HTTP Workshop so had some catching up to do. A little while later Ian submitted a Pull Request to the HTTP/3 specification called "Strict Priorities". This incorporated Pat's proposal and attempted to fix a number of those prioritization edge cases that I mentioned earlier.

In late May, another QUIC WG interim meeting was held in London at the new Cloudflare offices; here is the view from the meeting room window. Credit to Alessandro for handling the meeting arrangements.

Thanks to @cloudflare for hosting our interop and interim meetings in London this week! pic.twitter.com/LIOA3OqEjr

— IETF QUIC WG (@quicwg) May 23, 2019

Mike, the editor of the HTTP/3 specification, presented some of the issues with prioritization and we attempted to solve them with the conventional tree-based scheme. Ian, with contribution from Robin Marx (UHasselt), also presented an explanation of his "Strict Priorities" proposal. I recommend taking a look at Robin's priority tree visualisations which do a great job of explaining things. From that presentation I particularly liked "The prioritization spectrum"; it's a concise snapshot of the state of things at that time:

An overview of HTTP/3 prioritization issues, fixes and possible alternatives. Presented by Ian Swett at the QUIC Interim Meeting, May 2019.

June and July

Following the interim meeting, the prioritization "debate" continued electronically across GitHub and email. Some time in June Kazuho started work on a proposal that would use a scheme similar to Pat and Ian's absolute priorities. The major difference was that rather than send the priority signal in an HTTP frame, it would use a header field. This isn't a new concept; Roy Fielding proposed something similar at IETF 83.

In HTTP/2 and HTTP/3 requests are made up of frames that are sent on streams. Using a simple GET request as an example: a client sends a HEADERS frame that contains the scheme, method, path, and other request header fields. A server responds with a HEADERS frame that contains the status and response header fields, followed by DATA frame(s) that contain the payload.

To signal priority, a client could also send a PRIORITY frame. In the tree-based scheme the frame carries several fields that express dependencies and weights. Pat and Ian's proposals changed the contents of the PRIORITY frame. Kazuho's proposal encodes the priority as a header field that can be carried in the HEADERS frame as normal metadata, removing the need for the PRIORITY frame altogether.

I liked the simplification of Kazuho's approach and the new opportunities it might create for application developers. HTTP/2 and HTTP/3 implementations (in particular browsers) abstract away a lot of connection-level details such as streams or frames. That makes it hard to understand what is happening or to tune it.

The lingua franca of the Web is HTTP requests and responses, which are formed of header fields and payload data. In browsers, APIs such as Fetch and Service Worker allow handling of these primitives. In servers, there may be ways to interact with the primitives via configuration or programming languages. As part of Enhanced HTTP/2 Prioritization, we have exposed prioritization to Cloudflare Workers to allow rich behavioural customization. If a Worker adds the "cf-priority" header to a response, Cloudflare’s edge servers use the specified priority to serve the response. This might be used to boost the priority of a resource that is important to the load time of a page. To help inform this decision making, the incoming browser priority signal is encapsulated in the request object passed to a Worker's fetch event listener (request.cf.requestPriority).
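
For a flavour of what that customization can look like, here is a hedged sketch of a Worker that boosts one resource, written in TypeScript and assuming the Cloudflare Workers runtime types. The path being matched and the "cf-priority" value written are purely illustrative; the accepted value format is whatever the feature's documentation specifies.

// Sketch of a Worker that attaches a priority hint to one response.
// Assumes @cloudflare/workers-types; values below are illustrative only.
addEventListener("fetch", (event: FetchEvent) => {
  event.respondWith(handle(event.request));
});

async function handle(request: Request): Promise<Response> {
  // The incoming browser priority signal described in the post, if present.
  const browserPriority = (request.cf as any)?.requestPriority;
  console.log("incoming browser priority signal:", browserPriority);

  const response = await fetch(request);
  if (new URL(request.url).pathname.endsWith("/critical.css")) {
    // Copy the response so its headers can be modified, then add the hint.
    const boosted = new Response(response.body, response);
    boosted.headers.set("cf-priority", "30/0"); // illustrative value
    return boosted;
  }
  return response;
}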

Standardising approaches to problems is part of helping to build a better Internet. Because of the resonance between Cloudflare's work and Kazuho's proposal, I asked if he would consider letting me come aboard as a co-author. He kindly accepted and on July 8th we published the first version as an Internet-Draft.

Meanwhile, Ian was helping to drive the overall prioritization discussion and proposed that we use time during IETF 105 in Montreal to speak to a wider group of people. We kicked off the week with a short presentation to the HTTP WG from Ian, and Kazuho and I presented our draft in a side-meeting that saw a healthy discussion. There was a realization that the concepts of prioritization scheme, priority signalling and server resource scheduling (enacting prioritization) were conflated and made effective communication and progress difficult. HTTP/2's model was seen as one aspect, and two different I-Ds were created to deprecate it in some way (draft-lassey-priority-setting, draft-peon-httpbis-h2-priority-one-less). Martin Thomson (Mozilla) also created a Pull Request that simply removed the PRIORITY frame from HTTP/3.

To round off the week, in the second HTTP session it was decided that there was sufficient interest in resolving the prioritization debate via the creation of a design team. I joined the team led by Ian Swett along with others from Adobe, Akamai, Apple, Cloudflare, Fastly, Facebook, Google, Microsoft, and UHasselt.

August to October

Martin's PR generated a lot of conversation. It was merged under the proviso that some solution be found before the HTTP/3 specification was finalized. Between May and August we went from something very complicated (e.g. Orphan placeholder, with PRIORITY only on the control stream, plus exclusive priorities) to a blank canvas. The pressure was now on!

The design team held several teleconference meetings across the months. Logistics are a bit difficult when you have team members distributed across West Coast America, East Coast America, Western Europe, Central Europe, and Japan. However, thanks to some late nights and early mornings we managed to all get on the call at the same time.

In October most of us travelled to Cupertino, CA to attend another QUIC interim meeting hosted at Apple's Infinite Loop (Eric Kinnear helping with arrangements). The first two days of the meeting were used for interop testing and were loosely structured, so the design team took the opportunity to hold the first face-to-face meeting. We made some progress and helped Ian to form up some new slides to present later in the week. Again, there was some useful discussion and signs that we should put some time on the agenda at IETF 106.

November

The design team came to agreement that draft-kazuho-httpbis-priority was a good basis for a new prioritization scheme. We decided to consolidate the various I-Ds that had sprung up during IETF 105 into the document, making it a single place where people could track progress and open issues if required. This is why, even though Kazuho and I are the named authors, the document reflects broad input from the community. We published draft 03 in November, just ahead of the deadline for IETF 106 in Singapore.

Many of us travelled to Singapore ahead of the actual start of IETF 106. This wasn't to squeeze in some sightseeing (sadly) but rather to attend the IETF Hackathon. These are events where engineers and researchers can really put the concept of "running code" to the test. I really enjoy attending and I'm grateful to Charles Eckel and the team that organised it. If you'd like to read more about the event, Charles wrote up a nice blog post that, through some strange coincidence, features a picture of me, Kazuho and Robin talking at the QUIC table.

Link: https://t.co/8qP78O6cPS

— Lucas Pardue (@SimmerVigor) December 17, 2019

The design team held another face-to-face during a Hackathon lunch break and decided that we wanted to make some tweaks to the design written up in draft 03. Unfortunately the freeze was still in effect so we could not issue a new draft. Instead, we presented the most recent thinking to the HTTP session on Monday where Ian put forward draft-kazuho-httpbis-priority as the group's proposed design solution. Ian and Robin also shared results of prioritization experiments. We received some great feedback in the meeting and during the week pulled out all the stops to issue a new draft 04 before the next HTTP session on Thursday. The question now was: Did the WG think this was suitable to adopt as the basis of an alternative prioritization scheme? I think we addressed a lot of the feedback in this draft and there was a general feeling of support in the room. However, in the IETF consensus is declared via mailing lists and so Tommy Pauly, co-chair of the HTTP WG, put out a Call for Adoption on November 21st.

December

In the Cloudflare London office, preparations begin for mince pie acquisition and assessment.

The HTTP priorities team played the waiting game and watched the mailing list discussion. On the whole people supported the concept but there was one topic that divided opinion. Some people loved the use of headers to express priorities, some people didn't and wanted to stick to frames.

On December 13th Tommy announced that the group had decided to adopt our document and assign Kazuho and me as editors. The header/frame divide was noted as something that needed to be resolved.

The next step of the journey

Just because the document has been adopted does not mean we are done. In some ways we are just getting started. Perfection is often the enemy of getting things done and so sometimes adoption occurs at the first incarnation of a "good enough" proposal.

Today HTTP/3 has no prioritization signal. Without priority information there is a small danger that servers pick a scheduling strategy that is not optimal, which could cause the web performance of HTTP/3 to be worse than HTTP/2's. To avoid that happening we'll refine and complete the design of the Extensible Priority Scheme. There are open issues that we have to resolve, we'll need to square the circle on headers vs. frames, and we'll no doubt hit unknown unknowns. We'll need the input of the WG to make progress and their help to document the design that fits the need, and so I look forward to continued collaboration across the Internet community.

2019 was quite a ride and I'm excited to see what 2020 brings.

If working on protocols is your interest and you like what Cloudflare is doing, please visit our careers page. Our journey isn't finished; in fact, far from it.

Categories: Technology

Happy Holidays!

CloudFlare - Fri, 27/12/2019 - 19:30
Happy Holidays!

I joined Cloudflare in July of 2019, but I've known of Cloudflare for years. I always read the blog posts and looked at the way the company was engaging with the community. I also noticed the diversity in the names of many of the blog post authors.

There are over 50 languages spoken at Cloudflare, as our team includes people from many countries, with different backgrounds, religions, genders and cultures. And it is this diversity that makes us a great team.

A few days ago I asked one of my colleagues how he would say "Happy Holidays!" in Arabic. When I heard him say it, I instantly got the idea of recording a video in as many languages as possible of our colleagues wishing all of you, our readers and customers, a happy winter season.

It only took one internal message for people to start responding and sending their videos to me. Some did it themselves, others gathered in a meeting room and helped each other record their greetings. It took a few days and some video editing to put together an informal video that was entirely done by the team, to wish you all the best as we close this year and decade.

So here it is: Happy Holidays from all of us at Cloudflare!

Let us know if you speak any of the languages in the video. Or maybe you can tell us how you greet each other, at this time of the year, in your native language.

Categories: Technology

This holiday's biggest online shopping day was... Black Friday

CloudFlare - Tue, 24/12/2019 - 18:04
This holiday's biggest online shopping day was... Black Friday

What’s the biggest day of the holiday season for online shopping? Black Friday, the day after US Thanksgiving, has been embraced globally as the day retail stores announce their sales. But it was believed that the following Monday, dubbed “Cyber Monday,” might be even bigger. Or, with the explosion of reliable 2-day and even 1-day shipping, maybe another day closer to Christmas has taken the crown. At Cloudflare, we aimed to answer this question for the 2019 holiday shopping season.

Black Friday was the biggest online shopping day but the second biggest wasn't Cyber Monday... it was Thanksgiving Day itself (the day before Black Friday!). Cyber Monday was the fourth biggest day.

Here's a look at checkout events seen across Cloudflare's network since before Thanksgiving in the US.

Checkout events as a percentage of checkouts on Black Friday

The weekends are shown in yellow and Black Friday and Cyber Monday are shown in green. You can see that checkouts ramped up during Thanksgiving week and then continued through the weekend into Cyber Monday.

Black Friday had twice the number of checkouts as the preceding Friday and the entire Thanksgiving week dominates. Post-Cyber Monday, no day reached 50% of the number of checkouts we saw on Black Friday. And Cyber Monday was just 60% of Black Friday.

So, Black Friday is the peak day but Thanksgiving Day is the runner up. Perhaps it deserves its own moniker: Thrifty Thursday anyone?

Checkouts occur more frequently from Monday to Friday and then drop off over the weekend.  After Cyber Monday only one other day showed an interesting peak. Looking at last week it does appear that Tuesday, December 17 was the pre-Christmas peak for online checkouts. Perhaps fast online shipping made consumers feel they could use online shopping as long as they got their purchases by the weekend before Christmas.

Happy Holidays from everyone at Cloudflare!

Categories: Technology

First Half 2019 Transparency Report and an Update on a Warrant Canary

CloudFlare - Fri, 20/12/2019 - 21:49
First Half 2019 Transparency Report and an Update on a Warrant Canary

Today, we are releasing Cloudflare’s transparency report for the first half of 2019. We recognize the importance of keeping the reports current, but it’s taken us a little longer than usual to put it together. We have a few notable updates.

Pulling a warrant canary

Since we issued our very first transparency report in 2014, we’ve maintained a number of commitments - known as warrant canaries - about what actions we will take and how we will respond to certain types of law enforcement requests. We supplemented those initial commitments earlier this year, so that our current warrant canaries state that Cloudflare has never:

  1. Turned over our encryption or authentication keys or our customers' encryption or authentication keys to anyone.
  2. Installed any law enforcement software or equipment anywhere on our network.
  3. Terminated a customer or taken down content due to political pressure*
  4. Provided any law enforcement organization a feed of our customers' content transiting our network.
  5. Modified customer content at the request of law enforcement or another third party.
  6. Modified the intended destination of DNS responses at the request of law enforcement or another third party.
  7. Weakened, compromised, or subverted any of its encryption at the request of law enforcement or another third party.

These commitments serve as a statement of values to remind us what is important to us as a company, to convey not only what we do, but what we believe we should do. For us to maintain these commitments, we have to believe not only that we’ve met them in the past, but that we can continue to meet them.

Unfortunately, there is one warrant canary that no longer meets the test for remaining on our website. After Cloudflare terminated the Daily Stormer’s service in 2017, Matthew observed:

"We're going to have a long debate internally about whether we need to remove the bullet about not terminating a customer due to political pressure. It's powerful to be able to say you've never done something. And, after today, make no mistake, it will be a little bit harder for us to argue against a government somewhere pressuring us into taking down a site they don't like."

We addressed this issue in our subsequent transparency reports by retaining the statement, but adding an asterisk identifying the Daily Stormer debate and the criticism that we had received in the wake of our decision to terminate services. Our goal was to signal that we remained committed to the principle that we should not terminate a customer due to political pressure, while not ignoring the termination. We also sought to be public about the termination and our reasons for the decision, ensuring that it would not go unnoticed.

Although that termination sparked significant debate about whether infrastructure companies should be making decisions about what content remains online, we haven’t yet seen politically accountable actors put forth real alternatives to address deeply troubling content and behavior online. Since that time, we’ve seen even more real world consequences from the vitriol and hateful content spread online, from the screeds posted in connection with the terror attacks in Christchurch, Poway and El Paso to the posting of video glorifying those attacks. Indeed, in the absence of true public policy initiatives to address those concerns, the pressure on tech companies -- even deep Internet infrastructure companies like Cloudflare -- to make judgments about what stays online has only increased.

In August 2019, Cloudflare terminated service to 8chan based on their failure to moderate their hate-filled platform in a way that inspired murderous acts. Although we don’t think removing cybersecurity services to force a site offline is the right public policy approach to the hate festering online, a site’s failure to take responsibility to prevent or mitigate the harm caused by its platform leaves service providers like us with few choices. We’ve come to recognize that the prolonged and persistent lawlessness of others might require action by those further down the technical stack. Although we’d prefer that governments recognize that need, and build mechanisms for due process, if they fail to act, infrastructure companies may be required to take action to prevent harm.

And that brings us back to our warrant canary. If we believe we might have an obligation to terminate customers, even in a limited number of cases, retaining a commitment that we will never terminate a customer “due to political pressure” is untenable. We could, in theory, argue that terminating a lawless customer like 8chan was not a termination “due to political pressure.”  But that seems wrong.  We shouldn’t be parsing specific words of our commitments to explain to people why we don’t believe we’ve violated the standard.

We remain committed to the principle that providing cybersecurity services to everyone, regardless of content, makes the Internet a better place. Although we’re removing the warrant canary from our website, we believe that to earn and maintain our users’ trust, we must be transparent about the actions we take. We therefore commit to reporting on any action that we take to terminate a user that could be viewed as a termination “due to political pressure.”

UK/US Cloud agreement

As we’ve described previously, governments have been working to find ways to improve law enforcement access to digital evidence across borders. Those efforts resulted in a new U.S. law, the Clarifying Lawful Overseas Use of Data (CLOUD) Act, premised on the idea that law enforcement around the world should be able to get access to electronic content related to their citizens when conducting law enforcement investigations, wherever that data is stored, as long as they are bound by sufficient procedural safeguards to ensure due process.

On October 3, 2019, the US and UK signed the first Executive Agreement under this law.  According to the requirements of U.S. law, that Agreement will go into effect in 180 days, in March 2020, unless Congress takes action to block it.  There is an ongoing debate as to whether the agreement includes sufficient due process and privacy protections. We’re going to take a wait and see approach, and will closely monitor any requests we receive after the agreement goes into effect.

For the time being, Cloudflare intends to comply with appropriately scoped and targeted requests for data from UK law enforcement, provided that those requests are consistent with the law and international human rights standards. Information about the legal requests that Cloudflare receives from non-U.S. governments pursuant to the CLOUD Act will be included in future transparency reports.

Categories: Technology

Dutch PHP Conference 2020

PHP - Fri, 20/12/2019 - 14:48
Categories: Technology

An Update on CDNJS

CloudFlare - Thu, 19/12/2019 - 19:30
An Update on CDNJS

When you loaded this blog, a file was delivered to your browser called jquery-3.2.1.min.js. jQuery is a library which makes it easier to build websites, and was at one point included on as many as 74.1% of all websites. A full eighteen million sites include jQuery and other libraries using one of the most popular tools on Earth: CDNJS. Beginning about a month ago Cloudflare began to take a more active role in the operation of CDNJS. This post is here to tell you more about CDNJS’ history and explain why we are helping to manage CDNJS.

What CDNJS Does

Virtually every site is composed of not just the code written by its developers, but also dozens or hundreds of libraries. These libraries make it possible for websites to extend what a web browser can do on its own. For example, libraries can allow a site to include powerful data visualizations, respond to user input, or even get more performant.

These libraries created wondrous and magical new capabilities for web browsers, but they can also cause the size of a site to explode. Particularly a decade ago, connections were not always fast enough to permit the use of many libraries while maintaining performance. But if so many websites are all including the same libraries, why was it necessary for each of them to load their own copy?

If we all load jQuery from the same place the browser can do a much better job of not actually needing to download it for every site. When the user visits the first jQuery-powered site it will have to be downloaded, but it will already be cached on the user's computer for any subsequent jQuery-powered site they might visit.

The first visit might take time to load, but any future visit to any website pointing to this common URL would already be cached:

<!-- Loaded only on my site, will need to be downloaded by every user -->
<script src="./jquery.js"></script>

<!-- Loaded from a common location across many sites -->
<script src="https://cdnjs.cloudflare.com/jquery.js"></script>

Beyond the performance advantage, including files this way also made it very easy for users to experiment and create. When using a web browser as a creation tool users often didn't have elaborate build systems (this was also before npm), so being able to include a simple script tag was a boon. It's worth noting that it's not clear a massive performance advantage was ever actually provided by this scheme. It is becoming even less of a performance advantage now that browser vendors are beginning to use separate caches for each website you visit, but with millions of sites using CDNJS there's no doubt it is a critical part of the web.

A CDN for all of us

My first Pull Request into the CDNJS project was in 2013. Back then if you created a JavaScript project it wasn't possible to have it included in the jQuery CDN, or the ones provided by large companies like Google and Microsoft. They were only for big, important projects. Of course, even the biggest project starts small. The community needed a CDN which would agree to host nearly all JavaScript projects, even the ones which weren't world-changing (yet). In 2011, that project was launched by Ryan Kirkman and Thomas Davis as CDNJS.

The project was quickly wildly successful, far beyond their expectations. Their CDN bill quickly began to skyrocket (it would now be over a million dollars a year on AWS). Under the threat of having to shut down the service, Cloudflare was approached by the CDNJS team to see if we could help. We agreed to support their efforts and created cdnjs.cloudflare.com which serves the CDNJS project free of charge.

CDNJS has been astonishingly successful. The project is currently installed on over eighteen million websites (10% of the Internet!), offers files totaling over 1.5 billion lines of code, and serves over 173 billion requests a month. CDNJS only gets more popular as sites get larger, with 34% of the top 10k websites using the service. Each month we serve almost three petabytes of JavaScript, CSS, and other resources which power the web via cdnjs.cloudflare.com.

Spikes can happen when a very large or popular site installs CDNJS, or when a misbehaving web crawler discovers a CDNJS link.

The future value of CDNJS is now in doubt, as web browsers are beginning to use a separate cache for every website you visit. It is used on such a wide swath of the web, however, that it is unlikely to disappear any time soon.

How CDNJS Works

CDNJS starts with a Github repo. That project contains every file served by CDNJS, at every version which it has ever offered. That’s 182 GB without the commit history, over five million files, and over 1.5 billion lines of code.

Given that it stores and delivers versioned code files, in many ways it was the Internet’s first JavaScript package manager. Unlike other package managers and even other CDNs, everything CDNJS serves is publicly versioned. All 67,724 commits! This means you as a user can verify that you are being served files which haven’t been tampered with.

To make changes to CDNJS a commit has to be made. For new projects being added to CDNJS, or when projects change significantly, these commits are made by humans, and get reviewed by other humans. When projects just release new versions there is a bot made by Peter and maintained by Sven which sucks up changes from npm and automatically creates commits.

Within Cloudflare’s infrastructure there is a set of machines which are responsible for pulling the latest version of the repo periodically. Those machines then become the origin for cdnjs.cloudflare.com, with Cloudflare’s Global Load Balancer automatically handling failures. Cloudflare’s cache automatically stores copies of many of the projects making it possible for us to deliver them quickly from all 195 of our data centers.

The Internet on a Shoestring Budget

The CDNJS project has always been administered independently of Cloudflare. In addition to the founders, the project has been maintained by exceptionally hard-working caretakers like Peter and Matt Cowley. Maintaining a single repo of nearly every frontend project on Earth is no small task, and it has required a substantial amount of both manual work and bot development.

Unfortunately approximately thirty days ago one of those bots stopped working, preventing updated projects from appearing in CDNJS. The bot's open-source maintainer was not able to invest the time necessary to keep the bot running. After several weeks we were asked by the community and the CDNJS founders to take over maintenance of the CDNJS repo itself. This means the Cloudflare engineering team is taking responsibility for keeping the contents of github.com/cdnjs/cdnjs up to date, in addition to ensuring it is correctly served on cdnjs.cloudflare.com.

We agreed to do this because we were, frankly, scared. Like so many open-source projects CDNJS was a critical part of our world, but wasn’t getting the attention it needed to survive. The Internet relies on CDNJS as much as on any other single project, losing it or allowing it to be commandeered would be catastrophic to millions of websites and their visitors. If it began to fail, some sites would adapt and update, others would be broken forever.

CDNJS has always been, and remains, a project for and by the community. We are invested in making all decisions in a transparent and inclusive manner. If you are interested in contributing to CDNJS or in the topics we're currently discussing please visit the CDNJS Github Issues page.

A Plan for the Future

One example of an area where we could use your help is in charting a path towards a CDNJS which requires less manual moderation. Nothing can replace the intelligence and creativity of a human (yet), but for a task like managing what resources go into a CDN, relying on humans is error prone and time consuming. At present a human has to review every new project to be included, and often has to take additional steps to include new versions of a project.

As a part of our analysis of the project we examined a snapshot of the still-open PRs made against CDNJS for several months.

The vast majority of these PRs were changes which ultimately passed the automated review but nevertheless couldn't be merged without manual review.

There is consensus that we should move to a model which does not require human involvement in most cases. We would love your input and collaboration on the best way for that to be solved. If this is something you are passionate about, please contribute here.

Our plan is to support the CDNJS project in whichever ways it requires for as long as the Internet relies upon it. We invite you to use CDNJS in your next project with the full assurance that it is backed by the same network and team who protect and accelerate over twenty million of your favorite websites across the Internet. We are also planning more posts diving further into the CDNJS data; subscribe to this blog if you would like to be notified upon their release.

Categories: Technology

Drupal core - Moderately critical - Access bypass - SA-CORE-2019-011

Drupal Security - Wed, 18/12/2019 - 18:16
Project: Drupal core
Version: 8.8.x-dev, 8.7.x-dev
Date: 2019-December-18
Security risk: Moderately critical 10∕25 AC:Basic/A:User/CI:Some/II:None/E:Theoretical/TD:Default
Vulnerability: Access bypass
Description:

The Media Library module has a security vulnerability whereby it doesn't sufficiently restrict access to media items in certain configurations.

Solution: 
  • If you are using Drupal 8.7.x, you should upgrade to Drupal 8.7.11.
  • If you are using Drupal 8.8.x, you should upgrade to Drupal 8.8.1.

Versions of Drupal 8 prior to 8.7.x are end-of-life and do not receive security coverage.

Alternatively, you may mitigate this vulnerability by unchecking the "Enable advanced UI" checkbox on /admin/config/media/media-library. (This mitigation is not available in 8.7.x.)

Categories: Technology

Drupal core - Critical - Multiple vulnerabilities - SA-CORE-2019-012

Drupal Security - Wed, 18/12/2019 - 18:13
Project: Drupal core
Version: 8.8.x-dev, 8.7.x-dev, 7.x-dev
Date: 2019-December-18
Security risk: Critical 17∕25 AC:Basic/A:User/CI:All/II:All/E:Proof/TD:Uncommon
Vulnerability: Multiple vulnerabilities
Description:

The Drupal project uses the third-party library Archive_Tar, which has released a security update that impacts some Drupal configurations.

Multiple vulnerabilities are possible if Drupal is configured to allow .tar, .tar.gz, .bz2 or .tlz file uploads and processes them.

The latest versions of Drupal update Archive_Tar to 1.4.9 to mitigate the file processing vulnerabilities.

Solution: 

Install the latest version:

Versions of Drupal 8 prior to 8.7.x are end-of-life and do not receive security coverage.

Categories: Technology

Drupal core - Moderately critical - Multiple vulnerabilities - SA-CORE-2019-010

Drupal Security - Wed, 18/12/2019 - 18:07
Project: Drupal core
Version: 8.8.x-dev, 8.7.x-dev
Date: 2019-December-18
Security risk: Moderately critical 14∕25 AC:Basic/A:Admin/CI:Some/II:All/E:Theoretical/TD:Default
Vulnerability: Multiple vulnerabilities
Description:

Drupal 8 core's file_save_upload() function does not strip the leading and trailing dot ('.') from filenames, like Drupal 7 did.

Users with the ability to upload files with any extension in conjunction with contributed modules may be able to use this to upload system files such as .htaccess in order to bypass protections afforded by Drupal's default .htaccess file.

After this fix, file_save_upload() now trims leading and trailing dots from filenames.

Solution: 

Install the latest version:

  • If you use Drupal core 8.7.x: 8.7.11
  • If you use Drupal core 8.8.x: 8.8.1

Versions of Drupal 8 prior to 8.7.x are end-of-life and do not receive security coverage.

Categories: Technology

Announcing the CSAM Scanning Tool, Free for All Cloudflare Customers

CloudFlare - Wed, 18/12/2019 - 18:02
Announcing the CSAM Scanning Tool, Free for All Cloudflare Customers

Two weeks ago we wrote about Cloudflare's approach to dealing with child sexual abuse material (CSAM). We first began working with the National Center for Missing and Exploited Children (NCMEC), the US-based organization that acts as a clearinghouse for removing this abhorrent content, within months of our public launch in 2010. Over the last nine years, our Trust & Safety team has worked with NCMEC, Interpol, and nearly 60 other public and private agencies around the world to design our program. And we are proud of the work we've done to remove CSAM from the Internet.

The most repugnant cases, in some ways, are the easiest for us to address. While Cloudflare is not able to remove content hosted by others, we will take steps to terminate services to a website when it becomes clear that the site is dedicated to sharing CSAM or if the operators of the website and its host fail to take appropriate steps to take down CSAM content. When we terminate websites, we purge our caches — something that takes effect within seconds globally — and we block the website from ever being able to use Cloudflare's network again.

Addressing the Hard Cases

The hard cases are when a customer of ours runs a service that allows user generated content (such as a discussion forum) and a user uploads CSAM, or if they’re hacked, or if they have a malicious employee that is storing CSAM on their servers. We've seen many instances of these cases where services intending to do the right thing are caught completely off guard by CSAM that ended up on their sites. Despite the absence of intent or malice in these cases, there’s still a need to identify and remove that content quickly.

Today we're proud to take a step to help deal with those hard cases. Beginning today, every Cloudflare customer can log in to their dashboard and enable access to the CSAM Scanning Tool. As the CSAM Scanning Tool moves through development to production, the tool will check all Internet properties that have enabled CSAM Scanning for this illegal content. Cloudflare will automatically send a notice to you when it flags CSAM material, block that content from being accessed (with a 451 “blocked for legal reasons” status code), and take steps to support proper reporting of that content in compliance with legal obligations.

CSAM Scanning will be available via the Cloudflare dashboard at no cost for all customers regardless of their plan level. You can find this tool under the “Caching” tab in your dashboard. We're hopeful that by opening this tool to all our customers for free we can help do even more to counter CSAM online and help protect our customers from the legal and reputational risk that CSAM can pose to their businesses.

It has been a long journey to get to the point where we could commit to offering this service to our millions of users. To understand what we're doing and why it has been challenging from a technical and policy perspective, you need to understand a bit about the state of the art of tracking CSAM.

Finding Similar Images

Around the same time as Cloudflare was first conceived in 2009, a Dartmouth professor named Hany Farid was working on software that could compare images against a list of hashes maintained by NCMEC. Microsoft took the lead in creating a tool, PhotoDNA, that used Prof. Farid’s work to identify CSAM automatically.

In its earliest days, Microsoft used PhotoDNA for their services internally and, in late 2009, donated the technology to NCMEC to help manage its use by other organizations. Social networks were some of the first adopters. In 2011, Facebook rolled out an implementation of the technology as part of their abuse process. Twitter incorporated it in 2014.

The process is known as a fuzzy hash. Traditional hash algorithms like MD5, SHA1, and SHA256 take a file (such as an image or document) of arbitrary length and output a fixed length number that is, effectively, the file’s digital fingerprint. For instance, if you take the MD5 of this picture then the resulting fingerprint is 605c83bf1bba62e85f4f5fccc56bc128.

The base image

If we change a single pixel in the picture to be slightly off white rather than pure white, it's visually identical but the fingerprint changes completely to 42ea4fb30a440d8787477c6c37b9daed. As you can see from the two fingerprints, a small change to the image results in a massive and unpredictable change to the output of a traditional hash.

The base image with a single pixel changed

This is great for some uses of hashing where you want to definitively identify if the document you're looking at is exactly the same as the one you've seen before. For example, if an extra zero is added to a digital contract, you want the hash of the document used in its signature to no longer be valid.
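
You can reproduce that avalanche behaviour with a few lines of Node-flavoured TypeScript using the built-in crypto module. The buffers below are stand-ins for the two images, so the digests printed will of course differ from the fingerprints quoted above.

import { createHash } from "crypto";

// Two buffers that differ by a single byte, standing in for the two images.
const original = Buffer.alloc(1024, 0xff); // all "pure white" bytes
const tweaked = Buffer.from(original);
tweaked[512] = 0xfe;                       // one byte made "slightly off white"

const md5 = (buf: Buffer) => createHash("md5").update(buf).digest("hex");

console.log(md5(original)); // fingerprint of the original buffer
console.log(md5(tweaked));  // a completely different fingerprint, despite a 1-byte change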

Fuzzy Hashing

However, in the case of CSAM, this characteristic of traditional hashing is a liability. In order to avoid detection, the criminals producing CSAM resize, add noise, or otherwise alter the image in such a way that it looks the same but it would result in a radically different hash.

Fuzzy hashing works differently. Instead of determining if two photos are exactly the same it instead attempts to get at the essence of a photograph. This allows the software to calculate hashes for two images and then compare the "distance" between the two. While the fuzzy hashes may still be different between two photographs that have been altered, unlike with traditional hashing, you can compare the two and see how similar the images are.

So, in the two photos above, the fuzzy hash of the first image is

00e308346a494a188e1042333147267a 653a16b94c33417c12b433095c318012 5612442030d1484ce82c613f4e224733 1dd84436734e4a5c6e25332e507a8218 6e3b89174e30372d

and the second image is

00e308346a494a188e1042333147267a 653a16b94c33417c12b433095c318012 5612442030d1484ce82c613f4e224733 1dd84436734e4a5c6e25332e507a8218 6e3b89174e30372d

There's only a slight difference between the two in terms of pixels and the fuzzy hashes are identical.

The base image after increasing the saturation, changing to sepia, adding a border and then adding random noise.

Fuzzy hashing is designed to be able to identify images that are substantially similar. For example, we modified the image of dogs by first enhancing its color, then changing it to sepia, then adding a border and finally adding random noise.  The fuzzy hash of the new image is

00d9082d6e454a19a20b4e3034493278 614219b14838447213ad3409672e7d13 6e0e4a2033de545ce731664646284337 1ecd4038794a485d7c21233f547a7d2e 663e7c1c40363335

This looks quite different from the hash of the unchanged image above, but fuzzy hashes are compared by seeing how close they are to each other.

The largest possible distance between two images is about 5 million units. These two fuzzy hashes are just 4,913 units apart (the smaller the number, the more similar the images), indicating that they are substantially the same image.

Compare that with two unrelated photographs. The photograph below has a fuzzy hash of

011a0d0323102d048148c92a4773b60d 0d343c02120615010d1a47017d108b14 d36fff4561aebb2f088a891208134202 3e21ff5b594bff5eff5bff6c2bc9ff77 1755ff511d14ff5b

The photograph below has a fuzzy hash of

062715154080356b8a52505955997751 9d221f4624000209034f1227438a8c6a 894e8b9d675a513873394a2f3d000722 781407ff475a36f9275160ff6f231eff 465a17f1224006ff

The distance between the two hashes is calculated as 713,061. Through experimentation, it's possible to set a distance threshold under which you can consider two photographs to be likely related.
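
To illustrate the idea of comparing hashes by distance, the TypeScript sketch below treats two fuzzy hashes as byte vectors and sums the squared per-byte differences. The real tool's metric is intentionally not public, so this particular distance function is an assumption used only to show how a threshold comparison might work.

// Illustrative distance between two fuzzy hashes; not the real tool's metric.
function hexToBytes(hex: string): number[] {
  const clean = hex.replace(/\s+/g, "");
  const bytes: number[] = [];
  for (let i = 0; i < clean.length; i += 2) {
    bytes.push(parseInt(clean.slice(i, i + 2), 16));
  }
  return bytes;
}

function distance(hashA: string, hashB: string): number {
  const a = hexToBytes(hashA);
  const b = hexToBytes(hashB);
  let sum = 0;
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    sum += (a[i] - b[i]) ** 2; // squared per-byte difference
  }
  return sum;
}

// Identical hashes are 0 apart; similar images give small distances, unrelated
// ones give large distances, and a chosen threshold decides "likely the same".
const likelyMatch = (a: string, b: string, threshold: number) => distance(a, b) < threshold;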

Fuzzy Hashing's Intentionally Black Box

How does it work? While there has been lots of work on fuzzy hashing published, the innards of the process are intentionally a bit of a mystery. The New York Times recently wrote a story that was probably the most public discussion of how one such technology works. The challenge is that if the criminal producers and distributors of CSAM knew exactly how such tools worked, they might be able to craft alterations to their images in order to beat them. To be clear, Cloudflare will be running the CSAM Screening Tool on behalf of the website operator from within our secure points of presence. We will not be distributing the software directly to users. We will remain vigilant for potential attempted abuse of the platform, and will take prompt action as necessary.

Tradeoff Between False Negatives and False Positives

We have been working with a number of authorities on how we can best roll out this functionality to our customers. One of the challenges for a network with as diverse a set of customers as Cloudflare's is what the appropriate threshold should be to set the comparison distance between the fuzzy hashes.

If you set the threshold too strict — meaning that it's closer to a traditional hash and two images need to be virtually identical to trigger a match — then you're more likely to have many false negatives (i.e., CSAM that isn't flagged). If you set the threshold too loose, then it's possible to have many false positives. False positives may seem like the lesser evil, but there are legitimate concerns that increasing the possibility of false positives at scale could waste limited resources and further overwhelm the existing ecosystem. We will work to iterate the CSAM Scanning Tool to provide more granular control to the website owner while supporting the ongoing effectiveness of the ecosystem. Today, we believe we can offer a good first set of options for our customers that will allow us to more quickly flag CSAM without overwhelming the resources of the ecosystem.

Different Thresholds for Different Customers

The same desire for a granular approach was reflected in our conversations with our customers. When we asked what was appropriate for them, the answer varied radically based on the type of business, how sophisticated its existing abuse process was, and its likely exposure level and tolerance for the risk of CSAM being posted on their site.

For instance, a mature social network using Cloudflare with a sophisticated abuse team may want the threshold set quite loose, but not want the material to be automatically blocked because they have the resources to manually review whatever is flagged.

A new startup dedicated to providing a forum to new parents may want the threshold set quite loose and want any hits automatically blocked because they haven't yet built a sophisticated abuse team and the risk to their brand is so high if CSAM material is posted -- even if that will result in some false positives.

A commercial financial institution may want to set the threshold quite strict because they're less likely to have user generated content and would have a low tolerance for false positives, but then automatically block anything that's detected because if somehow their systems are compromised to host known CSAM they want to stop it immediately.
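
To make the idea concrete, here is a hypothetical sketch of what such per-site settings could look like; the field names and values are purely illustrative and are not Cloudflare's actual configuration options:

from dataclasses import dataclass

@dataclass
class ScanPolicy:
    # Illustrative knobs only: the maximum fuzzy-hash distance that counts
    # as a match, and what to do when a match is found.
    match_threshold: int   # larger = looser matching, more potential false positives
    on_match: str          # e.g. "notify_abuse_team" or "block_and_notify"

# Hypothetical per-site policies mirroring the three examples above.
policies = {
    "social-network.example":  ScanPolicy(match_threshold=250_000, on_match="notify_abuse_team"),
    "parenting-forum.example": ScanPolicy(match_threshold=250_000, on_match="block_and_notify"),
    "bank.example":            ScanPolicy(match_threshold=25_000,  on_match="block_and_notify"),
}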

Different Requirements for Different Jurisdictions

There also may be challenges based on where our customers are located and the laws and regulations that apply to them. Depending on where a customer's business is located and where it has users, it may choose to use one, more than one, or all of the different available hash lists.

In other words, one size does not fit all and, ideally, we believe allowing individual site owners to set the parameters that make the most sense for their particular site will result in lower false negative rates (i.e., more CSAM being flagged) than if we try to set one global standard for every one of our customers.

Improving the Tool Over Time

Over time, we are hopeful that we can improve CSAM screening for our customers. We expect that we will add additional lists of hashes from numerous global agencies for our customers with users around the world to subscribe to. We're committed to enabling this flexibility without overly burdening the ecosystem that is set up to fight this horrible crime.

Finally, we believe there may be an opportunity to help build the next generation of fuzzy hashing. For example, the software can only scan images that are at rest in memory on a machine, not those that are streaming. We're talking with Hany Farid, the former Dartmouth professor who now teaches at UC Berkeley, about ways that we may be able to build a more flexible fuzzy hashing system in order to flag images before they're even posted.

Concerns and Responsibility

One question we asked ourselves back when we began to consider offering CSAM scanning was whether we were the right place to be doing this at all. We share the universal concern about the distribution of depictions of horrific crimes against children, and believe it should have no place on the Internet. However, Cloudflare is a network infrastructure provider, not a content platform.

But we thought there was an appropriate role for us to play in this space. Fundamentally, Cloudflare delivers tools to our more than 2 million customers that were previously reserved for only the Internet giants. Without us, the security, performance, and reliability services that we offer, often for free, would have been extremely expensive or limited to Internet giants like Facebook and Google.

Today there are startups that are working to build the next Internet giant and compete with Facebook and Google because they can use Cloudflare to be secure, fast, and reliable online. But, as the regulatory hurdles around dealing with incredibly difficult issues like CSAM continue to increase, many of them lack access to sophisticated tools to scan proactively for CSAM. You have to get big to get into the club that gives you access to these tools, and, concerningly, being in the club is increasingly a prerequisite to getting big.

If we want more competition for the Internet giants we need to make these tools available more broadly and to smaller organizations. From that perspective, we think it makes perfect sense for us to help democratize this powerful tool in the fight against CSAM.

We hope this will help enable our customers to build more sophisticated content moderation teams appropriate for their own communities and will allow them to scale in a responsible way to compete with the Internet giants of today. That is directly aligned with our mission of helping build a better Internet, and it's why we're announcing that we will be making this service available for free for all our customers.

Categories: Technology

Drupal core - Moderately critical - Denial of Service - SA-CORE-2019-009

Drupal Security - Wed, 18/12/2019 - 18:01
Project: Drupal core
Version: 8.8.x-dev, 8.7.x-dev
Date: 2019-December-18
Security risk: Moderately critical 12∕25 AC:None/A:None/CI:None/II:None/E:Theoretical/TD:All
Vulnerability: Denial of Service
Description:

A visit to install.php can cause cached data to become corrupted. This could cause a site to be impaired until caches are rebuilt.

Solution: 

Install the latest version:

Versions of Drupal 8 prior to 8.7.x are end-of-life and do not receive security coverage.

To mitigate this issue in any version of Drupal 8, you can also block access to install.php if it's not required.

Categories: Technology

PHPBenelux Conference 2020

PHP - Wed, 18/12/2019 - 13:36
Categories: Technology

PHP 7.2.26 Released

PHP - Wed, 18/12/2019 - 13:02
Categories: Technology

PHP 7.3.13 Released

PHP - Wed, 18/12/2019 - 12:33
Categories: Technology

PHP 7.4.1 Released!

PHP - Wed, 18/12/2019 - 11:40
Categories: Technology

A new mom’s guide to pumping milk while traveling for work

CloudFlare - Tue, 17/12/2019 - 21:33
A new mom’s guide to pumping milk while traveling for work

Recently, I deployed a human to production. Shipped at 11 lbs 3 oz, he rapidly doubled in size in his first six months. At Cloudflare, I run the Developer Relations team, and in my first quarter back from parental leave I had 3 business trips: 2 international, 1 domestic. As an exclusive breastfeeder, I had to solve the logistical puzzle of moving a large quantity of milk home, to the tune of 40-50 oz (1200 - 1500 mL) per day given the size of my baby.

Since I ferried milk home to my baby and did extensive research in preparation, I figured I'd pay it forward, share my own learnings, and publish the guide that I wished someone had written for me. In the final section for further reading, I've linked many of the articles I read in preparation, although some of the advice in them is rather dated. I'm including them because I'm grateful to be standing on the shoulders of giants and accumulating the wisdom of all the parents who went on this adventure before me. What's possible in 2019 is truly amazing compared to a generation ago or even five to ten years ago.

Before I dive into the advice, I’d like to thank the other parents in our Parents group for their advice and help, our Office Team for maintaining a mother’s room in every HQ (even 2 in San Francisco), our People Team for their support of the Parents Employee Resource Group (ERG) and for helping me research insurance related questions, and Cloudflare in general for family friendly policies. (See career opportunities at Cloudflare )

What's in my pump bag?

When packing my pump bag, I packed for 2 pumping use cases: 1) pumping on the airplane or in a non-ideal (nursing room) area of an airport, and 2) pumping in a conference-provided mother’s room or a non-ideal private area. Here’s my packing list:

Pump Bag packing list (and notes):
  • Insulated cooler backpack

    I used an Igloo cooler bag because it was large enough to accommodate a smaller insulated milk bag and had separate compartments so I can access pump items without subjecting the inside to warm / room temperature air.
  • Insulated milk bag
  • Travel pump and bottles
  • Baby Buddha pump

    It charges via USB so I can use my power brick as a backup. This was recommended by another parent in the Parents ERG group. On my first trip I packed my Baby Buddha, my Willow set, and my manual pumps, but I really relied on the Baby Buddha for all my subsequent trips. (At home I use a Spectra, and at work we share a Medela hospital grade pump. I suppose I'm pumped to be a pump enthusiast.) On subsequent trips, I no longer packed bottles and went exclusively Kiinde + Baby Buddha.
  • Pump cleaning wipes

    I used Medela pump wipes. A large box came with a microwave sterilizer bag.
  • 2 refrigerator thermometers (see temperature management section below)
  • Extra gallon ziplock bags (a lot of them)
  • Printouts

    If you are traveling in the U.S., I recommend printing out these two TSA policy pages. Many airlines allow your medical device (e.g., breast pump) to be a separate carry-on item. Before my trips, I printed the airline policy page that states this for each airline I had flights with and stored it in my pump bag. Although it never proved necessary, I'm glad it was there just in case. Each airline may have a different policy, so call each airline to confirm that the pump bag doesn't count against carry-on limits, even if you're also printing out the policy from their website.
  • Sharpie to label milk bags
  • Travel pump parts cleaning kit
  • Ice Packs (must be frozen solid to pass through security)

    It is possible for an insulated milk bag inside a cooler backpack to maintain <4C for a 10 hour flight; I've done it. However, it will require multiple replenishments of ziplock bags of ice. Use the long-lasting type of ice pack (not the ones for treating injuries) so they stay frozen solid, not only through security but throughout the flight.
  • Manual pump as a backup

    I have both a Haakaa and a NatureBond manual pump for worst case scenario planning. Although I didn’t end up needing them during the flight, I would still pack them in the future to literally take the pressure off in tight situations.
Regular personal item packing list:
  • Hand sanitizer wipes (wipes are not a liquid and don’t count toward your quota, hooray!)
  • Baby bottle dish soap, travel size (with your liquids)
  • Big tupperware container (makeshift dishpan for washing pump parts)
  • Battery pack backup in case I can’t find an outlet (optional)
  • Nursing cover for pumping in my seat (optional)
Pro-Tips and lessons I learned from my journey:
  • Pre-assemble pump parts and store in ziplock bags; each pump separately. My preference for a 10-12 hour flight is to pack 3 kits: each kit is one pump part set, fully assembled with adapter and Kiinde bag, and an extra Kiinde bag. I’d then pump one side at a time, using the same kit, swap bags between sides (or when one was full). Since the Baby Buddha comes with 2 sets (one for each side), I used a Spectra set from home and the unofficial Baby Buddha component hacks. Even though I had pump wipes, I saved all my pump sets for a proper deep clean after getting to the hotel.
  • If/when I do need to clean pump parts on the plane (e.g., if, due to delays, I need to pump more times), I use my Medela pump wipes.
  • Make friends with a flight attendant and let them know how they can be helpful, e.g., fill your ziplock bags with a steady supply of ice.
  • Pack a lot of gallon ziplock bags. I packed over a dozen gallon ziplock bags for my first trip. It wasn't nearly enough. I recommend packing half a package of gallon size and half a package of quart sized with the zipper top. Asking for ice from airport vendors, flight attendants, storing a used pump I'll wash later, everything uses a fresh bag.
  • Stockpile frozen milk before your trip. I estimated that my baby consumes 40-50 oz per day, and I had nearly enough for a one week trip. It turned out that he consumed less milk than expected, and there was still some frozen milk in the freezer when I got home.
  • What happens if I need to be away from my hotel or a fridge for more than 3-4 hours? I pack my manual pump in my purse for post-conference social outings, and go to the bathroom and pump and dump just enough to take a little pressure off the top, and pump as soon as I get back to my hotel.
  • Once on the plane, you have several options as to where to pump. Some people pump in the bathrooms, but I prefer getting a window seat and pumping under a tulip style nursing top, which provides a similar amount of privacy to a nursing cover, but is much more maneuverable.
  • Liquid or gel hand sanitizer counts as a liquid for security purposes. My strategy is to rely on a combination of Medela pump wipes (FSA eligible) and hand sanitizer wipes. Your own comfort level with pumping at your seat may differ from mine, but I used hand sanitizer wipes on my hands (and arm rests, etc.) and another to wipe down the tray table for my pumping items. All cleaning after those two wipes was with the alcohol-free pump wipes.
Where can a mother pump around this airport / conference center / anywhere?

Many airports have mother's rooms, family bathrooms, nursery rooms, (see list and list) or Mamava pods. Personally, I didn't want to be in a rush to finish pumping or washing parts while boarding for my flight gets announced, even though I’m very grateful they exist and would use them when there’s a flight delay.

The ANA checkin agent thoughtfully warned me that Tokyo airport security won't allow milk as a carry-on liquid above the volume limits on the way back. Luckily, I was already planning to check it in a Milk Stork box, and cool my last batch in an ice bath before sealing the box. Milk Stork has compiled this helpful list of links to the airport security policy of different countries. The Points Guy blog compiled a helpful list of airline policies on whether a medical device (breast pump) qualifies as an additional carry on.

On my phone, I have installed these 2 apps for locating mother’s rooms: Pumpspotting and Mamava. Your luck with either of them will depend on the country you’re in. In Japan, for instance, nearly every shopping mall and hotel lobby seemed to have a “nursery” room which fit the bill, but almost none are on the apps. North America is better represented on the apps.

As a first resort, however, check with the conference organizers about a mother’s room. Developer Week and dotJS have both done a phenomenal job in making a mother’s room available. In one case, when I asked, the organizers learned that the venue already had a fully equipped mother’s room on site. In another case, the organizers worked with the venue to create a lockable private room with a refrigerator.

Don't be afraid of being the first person to ask a conference organizer for a mothers' room. You get zero percent of the things you don't ask for, and in the worst case, if there isn't one, you may need to take a trip back to your hotel room for a mid-day pump session.

Temperature management: is my cooler or refrigerator cold enough?

My general approach to temperature management, be it hotel room refrigerators, my cooler backpack or my mini cooler bag, is trust but verify. I bought 2 little thermometers, and on the road, I had one in the small bag, and one in the large bag but outside the small bag. This allowed me to be sure that my milk is cold enough no matter where I was.

Left: small milk cooler bag (Skip Hop from Target); Right: Igloo cooler backpack. I liked this bag because it has a top pocket where I can quickly access my pump and pump parts without opening the main compartment and exposing it to room temperature air.

The main igloo compartment stabilized at 13C and the internal bag stabilized at 8C with 2 long lasting gel ice packs. When I asked for additional ice from the flight attendants, it stabilized at 10C in the main compartment and 3C in the internal bag. After a while, I got an ice refresh and the internal compartment stabilized at 1C.

To prevent newly pumped warm milk from warming up the cooler, I used an external ziplock ice bath to rapid chill the milk bag before storing it in the cooler. For this reason, I preferred the stability of the Kiinde twist top bags and not being afraid of bursting a ziplock seal.

Always trust but verify temperature. Milk bags in a full size hotel refrigerator with fridge thermometer

Some hotel room mini fridges are cold enough and others aren't. Same with large refrigerators. Just like with cooler backpacks, my general advice is trust but verify. Of the two little thermometers, I took the one outside the internal insulated pouch and put it in the fridge to measure the refrigerator temperature before unpacking the cooler backpack.

  • At an Extended Stay Austin, I had to switch rooms to get a cold enough fridge. In the first room, the full sized refrigerator stabilized at 8C at its max, and couldn’t get colder, and the front desk people were happy to switch me to another room with a colder fridge, which was cold enough.
  • My fridge in the Tokyo hotel stabilized at 8-9C when I put milk bags in, but can get down to 4C when it's not trying to cool warm milk. So I had the hotel store my milk in their fridge with my igloo cooler backpack. 1 thermometer in hotel fridge, one in backpack, so I can confirm their fridge is cold enough at 3-4C.
  • My fridge in Paris was an old and weak little fridge that can get to 10C in ideal conditions, so I kept my milk at 4C in that fridge with a twice daily addition of ziplock bags full of ice provided by the hotel.

Lastly, some rooms have their power modulated by the key card in a slot by the door, and the refrigerator turns off when you're not there. Don't feel bad about using a business card to keep the power on so the refrigerator can stay on.

Milk Stork vs. OrderBoxesNow

Milk Stork came highly recommended by parents in our Parent Chat channel, whose spouses had used it at other companies, and there's currently internal discussion about potentially offering it as a benefit in the future.

Since my baby is very large (99th percentile), he consumes 40 - 50 oz per day (1200 - 1500 mL per day). That means Milk Stork's large 100 oz box is a 2 - 2.5 day supply for my baby, whereas it's close to a one week supply for a regular sized baby of a similar age. So I decided to try Milk Stork kits for some trips and buy replacement engines and/or boxes myself for other trips, in order to compare the experience.

And oh what a difference. I don’t have enough words for Milk Stork customer service. Milk Stork isn’t just a box with a refrigeration engine. You give them your trip information and they ship an entire kit to your hotel, which includes: the box and the refrigeration engine, milk storage bags, tamper evident seals, etc. Although you have to arrange the FedEx pickup yourself (and coordinate with your hotel front desk), they will pay for the freight and take care of the rest. When there was a hiccup with my FedEx pickup, and when I got a surprise FedEx invoice for import taxes on my milk, Milk Stork customer support got on the phone with FedEx to reverse the charges, saving me the headache of multiple phone calls.

It is incredibly easy to make minor mistakes when buying replacement engines instead. On one trip, I brought my empty Milk Stork boxes to re-use and shipped replacement engines to the hotel. Not only did I have a slight panic because the hotel at first thought they didn’t have my replacement engines, it also turned out that I had ordered the wrong size. After a last minute trip to Home Depot for some supplies (zip ties, tape, bubble wrap), I was able to disassemble the two Milk Stork coolers into panels and MacGyver them together into a functional franken-cooler that was the correct size for the refrigeration engine that I used for multiple trips. Since this required pulling an all-nighter due to regular pumping interruptions, this is not for the faint of heart.

MacGyvered franken-cooler box, constructed from 2 Milk Stork boxes, zipties, and packing tape in order to fit a larger size refrigeration engine.

Reasons you might consider buying the boxes instead of a kit:

  • You need a bigger volume box (e.g., shipping over 120 oz)
  • You are comfortable re-using the boxes and are buying replacement refrigeration engines
  • You’re comfortable with some last minute MacGyvering in case of errors
  • You (and baby’s caretaker) really prefer Kiinde bags for feeding (our family does) and you need a larger box to fit more bags in

Since I use Kiinde bags, I used plastic shrink bands instead of tamper evident seal stickers.

Final thoughts and Shout-outs

I would like to give a shout-out to the other parents in our Parents ERG (employee resource group). I especially want to thank Renee, my parent buddy in our returning-parent buddy system, for her contributions to the research, and Marina from our People Team for setting up that buddy system and for helping research policy, from company-internal to FSA-related questions. Thanks also to Jill for recommending the Baby Buddha pump, Dane for recommending Milk Stork, Amy for keeping not just one but two nice mother's rooms in the SF office to keep up with our demand, and Nicole, who always lets me borrow ice packs when I forget mine at home. And thank you Rebecca and all the other parents who trod down this path before me. Every time more parents take on the challenges, we collectively increase the demand for the products and services that make the challenges easier, and maybe a version of this post in 2025 will be a piece of cake. (See career opportunities at Cloudflare)

We have come such a long way from the days of shipping frozen breastmilk packed with dry ice. I am so grateful that I was not out trying to source dry ice in a country where I don’t speak the language.

Last but not least, I want to thank my husband and my mother-in-law, whose backs and wrists strain under the weight of our very large baby while I have been recovering from a wrist injury.

Further Reading

I count myself lucky to be able to stand on the shoulders of giants: all the parents who went on this adventure before me and shared their wisdom.

Reference Guides:

Articles (and a few notes that I took whilst reading them):

Check with hotel on fridge temperature. Bringing my own small thermometers.

Pump 45 minutes before landing so you have enough time to make it to your hotel in a variety of traffic conditions.

Avoid airplane water for washing your hands; use hand sanitizer, pump wipes, and store parts in your cooler. Ask for a microwave to steam sterilize parts. Bring steam sterilizer bag.

Mark item clearly as perishable. Large planes have some type of refrigerator that you can use to refrigerate your cooler. Smaller planes can provide extra ice.

Categories: Technology

How we used our new GraphQL Analytics API to build Firewall Analytics

CloudFlare - Thu, 12/12/2019 - 15:41

Firewall Analytics is the first product in the Cloudflare dashboard to utilize the new GraphQL Analytics API. All Cloudflare dashboard products are built using the same public APIs that we provide to our customers, allowing us to understand the challenges they face when interfacing with our APIs. This parity helps us build and shape our products, most recently the new GraphQL Analytics API that we’re thrilled to release today.

By defining the data we want, along with the response format, our GraphQL Analytics API has enabled us to prototype new functionality and iterate quickly from our beta user feedback. It is helping us deliver more insightful analytics tools within the Cloudflare dashboard to our customers.

Our user research and testing for Firewall Analytics surfaced common use cases in our customers' workflow:

  • Identifying spikes in firewall activity over time
  • Understanding the common attributes of threats
  • Drilling down into granular details of an individual event to identify potential false positives

We can address all of these use cases using our new GraphQL Analytics API.

GraphQL Basics

Before we look into how to address each of these use cases, let's take a look at the format of a GraphQL query and how our schema is structured.

A GraphQL query is comprised of a structured set of fields, for which the server provides corresponding values in its response. The schema defines which fields are available and their type. You can find more information about the GraphQL query syntax and format in the official GraphQL documentation.

To run some GraphQL queries, we recommend downloading a GraphQL client, such as GraphiQL, to explore our schema and run some queries. You can find documentation on getting started with this in our developer docs.

At the top level of the schema is the viewer field. This represents the top level node of the user running the query. Within this, we can query the zones field to find zones the current user has access to, providing a filter argument with a zoneTag of the identifier of the zone we'd like to narrow down to.

{
  viewer {
    zones(filter: { zoneTag: "YOUR_ZONE_ID" }) {
      # Here is where we'll query our firewall events
    }
  }
}
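
If you prefer to script this rather than use GraphiQL, the same query can be sent as a plain HTTP POST. Here is a minimal sketch in Python; it assumes an API token with Analytics read access, and the placeholder comment inside the query still needs to be replaced with real fields (such as the firewall queries below) before the API will return data:

import requests

QUERY = """
{
  viewer {
    zones(filter: { zoneTag: "YOUR_ZONE_ID" }) {
      # Here is where we'll query our firewall events
    }
  }
}
"""

response = requests.post(
    "https://api.cloudflare.com/client/v4/graphql",      # Cloudflare's GraphQL endpoint
    headers={"Authorization": "Bearer YOUR_API_TOKEN"},   # placeholder credentials
    json={"query": QUERY},
)
response.raise_for_status()
print(response.json())  # standard GraphQL response: {"data": ..., "errors": ...}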

Now that we have a query that finds our zone, we can start querying the firewall events which have occurred in that zone, to help solve some of the use cases we’ve identified.

Visualising spikes in firewall activity

It's important for customers to be able to visualise and understand anomalies and spikes in their firewall activity, as these could indicate an attack or be the result of a misconfiguration.

Plotting events in a timeseries chart, by their respective action, provides users with a visual overview of the trend of their firewall events.

Within the zones field in the query we created earlier, we can further query our firewall event aggregates using the firewallEventsAdaptiveGroups field and providing arguments for:

  • A limit for the count of groups
  • A filter for the date range we're looking for (combined with any user-entered filters)
  • A list of fields to orderBy (in this case, just the datetimeHour field that we're grouping by).

By adding the dimensions field, we're querying for groups of firewall events, aggregated by the fields nested within dimensions. In this case, our query includes the action and datetimeHour fields, meaning the response will be groups of firewall events which share the same action, and fall within the same hour. We also add a count field, to get a numeric count of how many events fall within each group.

query FirewallEventsByTime($zoneTag: string, $filter: FirewallEventsAdaptiveGroupsFilter_InputObject) {
  viewer {
    zones(filter: { zoneTag: $zoneTag }) {
      firewallEventsAdaptiveGroups(
        limit: 576
        filter: $filter
        orderBy: [datetimeHour_DESC]
      ) {
        count
        dimensions {
          action
          datetimeHour
        }
      }
    }
  }
}

Note - Each of our groups queries requires a limit to be set. A firewall event can have one of 8 possible actions, and we are querying over a 72 hour period. At most, we'll end up with 576 groups (8 actions × 72 hours), so we can set that as the limit for our query.

This query would return a response in the following format:

{
  "viewer": {
    "zones": [
      {
        "firewallEventsAdaptiveGroups": [
          {
            "count": 5,
            "dimensions": {
              "action": "jschallenge",
              "datetimeHour": "2019-09-12T18:00:00Z"
            }
          }
          ...
        ]
      }
    ]
  }
}

We can then take these groups and plot each as a point on a time series chart. Mapping over the firewallEventsAdaptiveGroups array, we can use each group's count property for the y-axis of our chart, then use the nested fields within the dimensions object: action as the unique series and datetimeHour as the timestamp on the x-axis.
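
A small sketch of that mapping step, assuming the viewer-rooted response shown above has already been parsed into a Python dict called data:

from collections import defaultdict

def to_time_series(data):
    """Turn firewallEventsAdaptiveGroups rows into one time series per action:
    {action: [(datetimeHour, count), ...]}, ready for charting."""
    series = defaultdict(list)
    for group in data["viewer"]["zones"][0]["firewallEventsAdaptiveGroups"]:
        dims = group["dimensions"]
        series[dims["action"]].append((dims["datetimeHour"], group["count"]))
    for points in series.values():
        points.sort()  # ISO 8601 timestamps sort chronologically as strings
    return dict(series)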

Top Ns

After identifying a spike in activity, our next step is to highlight events with commonality in their attributes. For example, if a certain IP address or individual user agent is causing many firewall events, this could be a sign of an individual attacker, or could be surfacing a false positive.

Similarly to before, we can query aggregate groups of firewall events using the firewallEventsAdaptiveGroups field. However, in this case, instead of supplying action and datetimeHour to the group’s dimensions, we can add individual fields that we want to find common groups of.

By ordering by descending count, we’ll retrieve groups with the highest commonality first, limiting to the top 5 of each. We can add a single field nested within dimensions to group by it. For example, adding clientIP will give five groups with the IP addresses causing the most events.

We can also add a firewallEventsAdaptiveGroups field with no nested dimensions. This will create a single group which allows us to find the total count of events matching our filter.

query FirewallEventsTopNs($zoneTag: string, $filter: FirewallEventsAdaptiveGroupsFilter_InputObject) {
  viewer {
    zones(filter: { zoneTag: $zoneTag }) {
      topIPs: firewallEventsAdaptiveGroups(
        limit: 5
        filter: $filter
        orderBy: [count_DESC]
      ) {
        count
        dimensions {
          clientIP
        }
      }
      topUserAgents: firewallEventsAdaptiveGroups(
        limit: 5
        filter: $filter
        orderBy: [count_DESC]
      ) {
        count
        dimensions {
          userAgent
        }
      }
      total: firewallEventsAdaptiveGroups(limit: 1, filter: $filter) {
        count
      }
    }
  }
}

Note - we can add the firewallEventsAdaptiveGroups field multiple times within a single query, each aliased differently. This allows us to fetch multiple different groupings by different fields, or with no groupings at all. In this case, getting a list of top IP addresses, top user agents, and the total events.


We can then reference each of these aliases in the UI, mapping over their respective groups to render each row with its count and a bar representing the proportion of all events that the row accounts for.
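
A rough sketch of that proportion calculation, again assuming the aliased response has been parsed into a dict called data:

def top_rows(data, alias, dimension):
    """Build (value, count, share_of_total) rows for one aliased Top N group,
    e.g. top_rows(data, "topIPs", "clientIP") or top_rows(data, "topUserAgents", "userAgent"),
    using the 'total' alias as the denominator."""
    zone = data["viewer"]["zones"][0]
    total = zone["total"][0]["count"]
    return [
        (group["dimensions"][dimension], group["count"], group["count"] / total)
        for group in zone[alias]
    ]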

Are these firewall events false positives?

After users have identified spikes, anomalies and common attributes, we wanted to surface more information as to whether these have been caused by malicious traffic, or are false positives.

To do this, we wanted to provide additional context on the events themselves, rather than just counts. We can do this by querying the firewallEventsAdaptive field for these events.

Our GraphQL schema uses the same filter format for both the aggregate firewallEventsAdaptiveGroups field and the raw firewallEventsAdaptive field. This allows us to use the same filters to fetch the individual events which summate to the counts and aggregates in the visualisations above.

query FirewallEventsList($zoneTag: string, $filter: FirewallEventsAdaptiveFilter_InputObject) {
  viewer {
    zones(filter: { zoneTag: $zoneTag }) {
      firewallEventsAdaptive(
        filter: $filter
        limit: 10
        orderBy: [datetime_DESC]
      ) {
        action
        clientAsn
        clientCountryName
        clientIP
        clientRequestPath
        clientRequestQuery
        datetime
        rayName
        source
        userAgent
      }
    }
  }
}

Once we have our individual events, we can render all of the individual fields we've requested, providing users the additional context on each event that they need to determine whether it is a false positive or not.

That's how we used our new GraphQL Analytics API to build Firewall Analytics, helping solve some of our customers' most common security workflow use cases. We're excited to see what you build with it, and the problems you can help tackle.

You can find out how to get started querying our GraphQL Analytics API using GraphiQL in our developer documentation, or learn more about writing GraphQL queries on the official GraphQL Foundation documentation.

Categories: Technology

Introducing the GraphQL Analytics API: exactly the data you need, all in one place

CloudFlare - Thu, 12/12/2019 - 15:41

Today we’re excited to announce a powerful and flexible new way to explore your Cloudflare metrics and logs, with an API conforming to the industry-standard GraphQL specification. With our new GraphQL Analytics API, all of your performance, security, and reliability data is available from one endpoint, and you can select exactly what you need, whether it’s one metric for one domain or multiple metrics aggregated for all of your domains. You can ask questions like “How many cached bytes have been returned for these three domains?” Or, “How many requests have all the domains under my account received?” Or even, “What effect did changing my firewall rule an hour ago have on the responses my users were seeing?”

The GraphQL standard also has strong community resources, from extensive documentation to front-end clients, making it easy to start creating simple queries and progress to building your own sophisticated analytics dashboards.

From many APIs...

Providing insights has always been a core part of Cloudflare’s offering. After all, by using Cloudflare, you’re relying on us for key parts of your infrastructure, and so we need to make sure you have the data to manage, monitor, and troubleshoot your website, app, or service. Over time, we developed a few key data APIs, including ones providing information regarding your domain’s traffic, DNS queries, and firewall events. This multi-API approach was acceptable while we had only a few products, but we started to run into some challenges as we added more products and analytics. We couldn’t expect users to adopt a new analytics API every time they started using a new product. In fact, some of the customers and partners that were relying on many of our products were already becoming confused by the various APIs.

Following the multi-API approach was also affecting how quickly we could develop new analytics within the Cloudflare dashboard, which is used by more people for data exploration than our APIs. Each time we built a new product, our product engineering teams had to implement a corresponding analytics API, which our user interface engineering team then had to learn to use. This process could take up to several months for each new set of analytics dashboards.

...to one

Our new GraphQL Analytics API solves these problems by providing access to all Cloudflare analytics. It offers a standard, flexible syntax for describing exactly the data you need and provides predictable, matching responses. This approach makes it an ideal tool for:

  1. Data exploration. You can think of it as a way to query your own virtual data warehouse, full of metrics and logs regarding the performance, security, and reliability of your Internet property.
  2. Building amazing dashboards, which allow for flexible filtering, sorting, and drilling down or rolling up. Creating these kinds of dashboards would normally require paying thousands of dollars for a specialized analytics tool. You get them as part of our product and can customize them for yourself using the API.

In a companion post that was also published today, my colleague Nick discusses using the GraphQL Analytics API to build dashboards. So, in this post, I’ll focus on examples of how you can use the API to explore your data. To make the queries, I’ll be using GraphiQL, a popular open-source querying tool that takes advantage of GraphQL’s capabilities.

Introspection: what data is available?

The first thing you may be wondering: if the GraphQL Analytics API offers access to so much data, how do I figure out what exactly is available, and how I can ask for it? GraphQL makes this easy by offering “introspection,” meaning you can query the API itself to see the available data sets, the fields and their types, and the operations you can perform. GraphiQL uses this functionality to provide a “Documentation Explorer,” query auto-completion, and syntax validation. For example, here is how I can see all the data sets available for a zone (domain):

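Because introspection is defined by the GraphQL specification itself, you can also pull the same schema information programmatically rather than through GraphiQL's Documentation Explorer. A minimal sketch in Python (the endpoint is Cloudflare's GraphQL endpoint; the auth header is a placeholder for whatever credentials your setup uses):

import requests

# The standard GraphQL introspection query, trimmed to type names and kinds.
INTROSPECTION = "{ __schema { queryType { name } types { name kind } } }"

response = requests.post(
    "https://api.cloudflare.com/client/v4/graphql",
    headers={"Authorization": "Bearer YOUR_API_TOKEN"},  # placeholder credentials
    json={"query": INTROSPECTION},
)
for gql_type in response.json()["data"]["__schema"]["types"]:
    print(gql_type["kind"], gql_type["name"])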

If I’m writing a query, and I’m interested in data on firewall events, auto-complete will help me quickly find relevant data sets and fields:

Querying: examples of questions you can ask

Let’s say you’ve made a major product announcement and expect a surge in requests to your blog, your application, and several other zones (domains) under your account. You can check if this surge materializes by asking for the requests aggregated under your account, in the 30 minutes after your announcement post, broken down by the minute:

{
  viewer {
    accounts(filter: { accountTag: $accountTag }) {
      httpRequests1mGroups(
        limit: 30
        filter: { datetime_geq: "2019-09-16T20:00:00Z", datetime_lt: "2019-09-16T20:30:00Z" }
        orderBy: [datetimeMinute_ASC]
      ) {
        dimensions {
          datetimeMinute
        }
        sum {
          requests
        }
      }
    }
  }
}

Here is the first part of the response, showing requests for your account, by the minute:


Now, let’s say you want to compare the traffic coming to your blog versus your marketing site over the last hour. You can do this in one query, asking for the number of requests to each zone:

{
  viewer {
    zones(filter: { zoneTag_in: [$zoneTag1, $zoneTag2] }) {
      httpRequests1hGroups(
        limit: 2
        filter: { datetime_geq: "2019-09-16T20:00:00Z", datetime_lt: "2019-09-16T21:00:00Z" }
      ) {
        sum {
          requests
        }
      }
    }
  }
}

Here is the response:


Finally, let’s say you’re seeing an increase in error responses. Could this be correlated to an attack? You can look at error codes and firewall events over the last 15 minutes, for example:

{
  viewer {
    zones(filter: { zoneTag: $zoneTag }) {
      httpRequests1mGroups(
        limit: 100
        filter: { datetime_geq: "2019-09-16T21:00:00Z", datetime_lt: "2019-09-16T21:15:00Z" }
      ) {
        sum {
          responseStatusMap {
            edgeResponseStatus
            requests
          }
        }
      }
      firewallEventsAdaptiveGroups(
        limit: 100
        filter: { datetime_geq: "2019-09-16T21:00:00Z", datetime_lt: "2019-09-16T21:15:00Z" }
      ) {
        dimensions {
          action
        }
        count
      }
    }
  }
}

Notice that, in this query, we’re looking at multiple datasets at once, using a common zone identifier to “join” them. Here are the results:


By examining both data sets in parallel, we can see a correlation: 31 requests were “dropped” or blocked by the Firewall, which is exactly the same as the number of “403” responses. So, the 403 responses were a result of Firewall actions.
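
That kind of cross-check is easy to script once the combined response has been parsed; a rough sketch, assuming the field names from the query above and a viewer-rooted dict called data:

def compare_errors_to_firewall(data):
    """Sum requests per edge response status and count firewall events per action
    over the same window, so spikes in 403s can be lined up with firewall activity."""
    zone = data["viewer"]["zones"][0]
    status_counts = {}
    for group in zone["httpRequests1mGroups"]:
        for entry in group["sum"]["responseStatusMap"]:
            status = entry["edgeResponseStatus"]
            status_counts[status] = status_counts.get(status, 0) + entry["requests"]
    action_counts = {
        group["dimensions"]["action"]: group["count"]
        for group in zone["firewallEventsAdaptiveGroups"]
    }
    return status_counts, action_counts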

Try it today

To learn more about the GraphQL Analytics API and start exploring your Cloudflare data, follow the “Getting started” guide in our developer documentation, which also has details regarding the current data sets and time periods available. We’ll be adding more data sets over time, so take advantage of the introspection feature to see the latest available.

Finally, to make way for the new API, the Zone Analytics API is now deprecated and will be sunset on May 31, 2020. The data that Zone Analytics provides is available from the GraphQL Analytics API. If you’re currently using the API directly, please follow our migration guide to change your API calls. If you get your analytics using the Cloudflare dashboard or our Datadog integration, you don’t need to take any action.

One more thing....

In the API examples above, if you find it helpful to get analytics aggregated for all the domains under your account, we have something else you may like: a brand new Analytics dashboard (in beta) that provides this same information. If your account has many zones, the dashboard is helpful for knowing summary information on metrics such as requests, bandwidth, cache rate, and error rate. Give it a try and let us know what you think using the feedback link above the new dashboard.


Categories: Technology

PHPWales 2020 - June 3rd to June 4th

PHP - Thu, 12/12/2019 - 08:04
Categories: Technology
