Google Open Source Code Search (opensource.google)
596 points by habosa on March 11, 2020 | 167 comments


Seems like this is based on Google's own internal code search tooling, something most engineers at Google rely on for everyday code-level work. I personally can't even begin to imagine how I'd navigate the gigantic codebase without it.

(I work at Google)


I had this same fear as I left Google, but it turns out that a great IDE (e.g. anything by JetBrains) can take you quite far.

It's a different workflow, but you simply don't need cs/ when your codebase is orders of magnitude smaller.


Would you know if the Google cloud product for hosting git projects [1] uses the same underlying code search as the internal tool?

[1] https://cloud.google.com/source-repositories


It is the same tool.


It’s also used for https://source.chromium.org. I now host my monorepo on Cloud Source Repositories because it has a super nice integration with the rest of their products.


If you crack open developer tools and watch the API requests go by, you'll be able to confirm that it's the same thing :)


Not really, this is a pale shadow of what the real CodeSearch inside Google does. I really wish the external ones had even 1/10th the functionality.


Help https://github.com/TreeTide/underhood reach feature parity ;)


What's in the other 9/10ths that the internal one has?


You'd use Sourcegraph, probably.


What is the constant phone-home activity in that opaque container they ship as Sourcegraph? It is occasionally the case that devs have too-fast machines, so their code isn't seen on ordinary equipment. With Sourcegraph and other inner-network dev tools, the amount of chatty traffic and build dependencies seems seriously off-putting, trending toward useless on an ordinary network.


This seems like a bad attitude. Perhaps you could constructively ask for a sourcegraph-lite that does less, in return for fewer deps and less networking complexity?


I am a dev at Sourcegraph, I'd be very open to any feedback.

You can firewall off Sourcegraph 100% for complete confidence, and aside from the first admin's email address (so we can notify them of any security updates) we only send back aggregated anonymous usage statistics which we are extremely transparent about: https://docs.sourcegraph.com/admin/pings

We sell developer tools, not user data.


Maybe the problem is that to disable telemetry you have to contact support and pray they like you, instead of giving a configuration option.


Ah, yeah, that's fair. I'll forward this feedback onto the team.

The option is documented in our config docs, though, and it also appears in the config editor's autocomplete in the app if you type `telemetry`, so it's not really a secret: https://docs.sourcegraph.com/admin/config/site_config#disabl...


The property is “disableNonCriticalTelemetry” [0], which seems curiously named to me.

What kind of telemetry is critical? How do I disable that?

[0] https://docs.sourcegraph.com/admin/config/site_config#disabl...


You are 100% correct, I really messed up here by suggesting that option. I misread our own docs. It would only disable event counts from being sent (e.g. instead of "how many jump-to-definitions were performed in a day?" we would just send a boolean "did one or more jump-to-definition occur in a day?", based on my reading of the code[1]), which is not what I thought it did. I will send a PR to clarify the docs on this so I don't mess up like this again.

I'm human and screw up frequently; this instance just happened to be on the ridiculously important topic of privacy. Hopefully you will forgive me for that. I wasn't trying to be malicious, but in retrospect I can certainly see this being interpreted as such. :/

The right option to turn it all off is just this one: since we only send ping data as part of the version update check, you disable that and it's all off. You can confirm this in the code, as I just did here[2][3]: https://docs.sourcegraph.com/admin/config/site_config#update... And as I mentioned previously, you can always firewall off Sourcegraph 100%.

As an aside, I can promise you that I wouldn't have continued to work at Sourcegraph for the last 5 years if I thought our business was selling or collecting identifiable user data in ANY form. We only collect just enough information to help prioritize what features we improve and (aside from the first admin's email as I noted already above) it is all 100% anonymous and aggregated numbers that we are extremely transparent about[4]. Our person running analytics is also constantly trying to make this more transparent[5] because we all are very security and privacy aware and know the #1 way to convince people to not run software is to make them think you are spying on them or using their data in ways they would not want.

It's obvious to me this should be more clear in our docs, I'm going to forward all of this conversation onto the rest of our team to make sure we improve our docs here.

[1] https://sourcegraph.com/search?q=repo:%5Egithub%5C.com/sourc...

[2] https://sourcegraph.com/github.com/sourcegraph/sourcegraph@f...

[3] https://sourcegraph.com/github.com/sourcegraph/sourcegraph@f...

[4] https://docs.sourcegraph.com/admin/pings

[5] https://github.com/sourcegraph/sourcegraph/pull/8930#issueco...


Just commenting here to thank you for all the sincerity, transparency and attentiveness in the comments.


How long do you estimate we will be able to use this before google inevitably kills the project?


Shameless plug: https://github.com/TreeTide/underhood is a work in progress UI over Kythe indices (the same indices that power Code Search).

If you already know how to index, this is a completely open source alternative, likely with fewer bells and whistles.

I worked at Google and miss Code Search. But I also have lots of ideas about how one can go beyond the status quo for code reading and debugging. Join if interested.


Googler here. We have the same Code Search tool internally, this is honestly one of my favorite things about working at Google. Great to see this open sourced.


This is missing tons of functionality and layers that the internal one has tho, like all of the automatic code analysis and linting, coverage and fuzzing integration, etc


How could it though? Without (bazel) BUILD files, it couldn't even know how to build everything.


How would the open version support things like the hot functions layer? It makes sense these are missing.


Some of those features are available with Cloud Source Repositories now too.


It hasn’t been open sourced, right?


Correct. This is just exposing indexed versions of some of our larger open source projects with this code search.

Chromium also has its code indexed by the older version of this tool: https://cs.chromium.org/


> Chromium also has its code indexed by the older version of this tool: https://cs.chromium.org/

Chromium recently switched over to a new version of code search.


They seem to have reduced the information density and killed readability as well with the "material ui" redesign. The old code search UI was perfect, with enough contrast to allow you to quickly grep through xrefs to locate relevant entries. The page itself was lightweight as well.

Compare that to the redesign where each xref jump has me staring at a spinner for half a second, the xref bar has no visual separation between type, filename, and code snippet, all buttons visually indistinguishable with a blue on white color scheme, the "layers" dropdown is replaced with a mishmash of buttons scattered across the layout, etc.

I really hope they're not forcing this abhorrent redesign on their developers as well.


Specifically, the new version is at https://source.chromium.org/, and the Android counterpart is https://cs.android.com/.


I guess source.android.com was taken :)


The site itself has not been, but the library used to build the map of cross-references is https://kythe.io/


Confusing. What exactly is special about this code search system? Seems common for internal code search



Also better than Amazon's internal tool.


Can't fathom a Google thing being great ?


Really loved this interface (also cs.chromium.org) while I worked at Google. It was easy for me to orient myself, find what uses this and that and where it's being used, and it also had a whole "debugging" facility:

You select your binary on Borg (think Kubernetes/Docker), and it fetches from the binary which CL (think Perforce "CL") it was built at, plus any additional cherry-picked CLs. Then it somehow goes back in time and shows how the source code looked then.

Later (I tried it in Java, but I believe it's available for other languages too) you can inject statements right around the beginning of a function, a kind of breakpoint. That statement can be something like "let's log how this function was called", and you were able to reference nearby statements. This could be set from the command line and took a bit of mastery (I was a bit afraid the first time I used it, or rather it had a chilling effect on me), but then my task (with 10 or 11 instances) reported these log lines, and I was able to see them in the browser.

I have no experience with GCP, or the public face of Google Cloud, so I don't know what's available there, but this was freakin' cool.


Misread as "Google Open Sources Code Search" :'(

cs.opensource.google is amazing


There's this https://github.com/google/zoekt. It's pretty light on features, but dang if it isn't fast and precise.


thanks!

If you install sourcegraph, you get the same btw. Sourcegraph indexed search is powered by zoekt.


zoekt is absolutely awesome. I was reading its code yesterday to figure out how it does ngram indexing and search.
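
For anyone curious what n-gram indexing looks like in miniature, here is a rough Python sketch of a trigram index. This is not zoekt's actual code or data layout; the names (`trigrams`, `TrigramIndex`, `add`, `search`) and the sample files are made up for illustration. The idea is to map every 3-character substring to the set of documents containing it, intersect those sets for the query's trigrams to get candidates, then confirm each candidate with a real substring check.

```python
from collections import defaultdict


def trigrams(text):
    """Return the set of all 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


class TrigramIndex:
    """Toy trigram index (hypothetical names; not zoekt's real design)."""

    def __init__(self):
        self.postings = defaultdict(set)  # trigram -> set of document names
        self.docs = {}                    # document name -> full text

    def add(self, name, text):
        self.docs[name] = text
        for gram in trigrams(text):
            self.postings[gram].add(name)

    def search(self, query):
        """Return sorted names of documents containing query as a substring."""
        grams = trigrams(query)
        if not grams:
            # Queries shorter than 3 characters cannot use the index; scan everything.
            candidates = set(self.docs)
        else:
            # Every trigram of the query must appear in a matching document.
            candidates = set.intersection(*(self.postings.get(g, set()) for g in grams))
        # Trigram hits are necessary but not sufficient, so verify each candidate.
        return sorted(name for name in candidates if query in self.docs[name])


index = TrigramIndex()
index.add("a.go", "func ParseQuery(s string) {}")
index.add("b.go", "func parseHeader(h string) {}")
print(index.search("ParseQ"))  # ['a.go']
print(index.search("string"))  # ['a.go', 'b.go']
```

The final substring check matters: a document can contain all of a query's trigrams without containing the query itself, so the index only narrows the set of files worth scanning.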


I found this on github in Google's org : https://github.com/google/codesearch


That "codesearch" is only superficially related to this one. The main feature of _this_ codesearch that makes it so useful is the cross references to callers, callees, and overrides. Ye olde codesearch has more in common with things like livegrep.


> The main feature of _this_ codesearch that makes it so useful is the cross references to callers, callees, and overrides. Ye olde codesearch has more in common with things like livegrep.

However this part of internal codesearch is the one part that is actually (partially) open sourced: kythe.io


What's the internal codesearch tool written in?


yeah, title is misleading.

https://cs.chromium.org/ is the next best thing.


"Google Open Sources Search Code"... wouldn't that be something


Check out Debian Code Search https://codesearch.debian.net/


Working with chromium/v8, I can honestly say google's code search infra is one of the most valuable resources available. I really hope they open source the backend at some point.


The backend is open sourced; it is Kythe (kythe.io). It supports Go, C++, and Java out of the box, for some definition of out of the box. Maybe even TypeScript. Cross-references between protobufs and generated code also work if you make the stars align ;)

As for UI, treetide/underhood, which I mention elsewhere, is the only open option right now.

But Kythe comes with command line utils and an API you can query directly as well.

What is missing from the open source is a production-ready parallel serving table builder. There is one in golang which uses Apache Beam, but last time I checked the go workers are not well supported on the Flink runner. It didn't even work properly on the GCP runner. Hope this would change.


Question for Googlers or others: What do you think is the most well-written piece of software produced by Google? I would like to study how the world's best engineers write code. (Preferably C++, as it's the language I'm most familiar with.)


The one that people at Google who are the keepers of C++ code quality standards maintain themselves is Abseil.

https://cs.opensource.google/kythe/kythe/+/master:external/c...


By necessity, Abseil is full of dark template magic that would very rarely be used elsewhere in the codebase. That's the point - it encapsulates a lot of useful abstractions and allows them to be used without the client code author thinking about the guts of the abstraction. But it makes it pretty unusual relative to typical Google C++.


True for much of it, but if you look at something like cord.h, it's almost free of template programs. Google C++ application code isn't all that spiffy, to be honest. I would say most of the code is dedicated to stuff that nobody outside of Google is going to care about. I think the base libraries are more interesting.


LevelDB is a widely used C++ project, written by the same people who wrote core parts of Google Search 20 years ago:

https://github.com/google/leveldb


> study how the world's best engineers write code

Note that by studying source code you'd only be seeing the final result, not the whole process. Also, I'd say the definition of good code varies by domain.


Given how version control works, source code is the area of human endeavor for which your first sentence is the least true.


I think base::, our "std library", is pretty well written and portable.


Glad to see google open sourced this. I also implemented code search in my open source project OneDev (https://github.com/theonedev/onedev). To try it, please visit https://code.onedev.io/projects/android-framework-base/. Press "t" for quick symbol search, and "v" for advanced search with regular expression support.

You may also hover mouse over a symbol to find its declaration and occurrences.


Nice. I'm grateful for this being posted on HN, because discoverability of that page seems to be zero (I couldn't find any link to it from opensource.google). It doesn't even have a page title, so googling it would be more complicated too.


Google used to have code search before (or at least far better than) anyone else did. Then they killed it.


I misread and thought it said Google Open Sources Code Search, hoping they had open sourced https://en.wikipedia.org/wiki/Google_Code_Search


Yeah, I'm super confused why suddenly now they provide a public code search service again. Uh, capricious overlords :/


Code search is a great tool! It really helps with productivity! But sometimes it is very easy to go down the rabbit hole.

I wonder if they can open source code search itself.


I don't know the state of it, but Kythe is open source: https://kythe.io/

But in reality you probably want something more like Sourcegraph, which packages everything up nicely so that you don't need to worry about it, or something more specialized.


Bringing up kythe from scratch is very daunting. First thing I did when I left google, of course, but still really hard.


What did you use for a UI?


That's one of the main problems. I started with the demo frontend that comes in the box and just hacked on that. I'm no UI developer by any stretch.


Try TreeTide


cs.opensource.google runs on Google's search infrastructure, so it's unlikely to be open sourced. https://github.com/google/zoekt is open source, but lacks cross-referencing, and has a more spartan UI.


If you want to do code search on your private Git, Mercurial, SVN, CVS, and other repositories, try the fully open source OpenGrok (https://github.com/oracle/opengrok).

It’s easy to self-install and use, has good documentation, and as an added bonus is very fast.


Sorry to be critical, but can someone explain to me the benefits of having code search this powerful?

Surely code is easier to explore in an IDE which understands the context and dependencies of the project... This just seems like a glorified "find"


If you think about Google's codebase size, an IDE wouldn't cut it. You could load and analyze dependencies/imports as you go, but that would make for a terrible user experience (think about IntelliJ indexing task every time you want to check the definition of something).

Also, Code Search has a lot of goodies baked in: history layer, cross-references, call sites, ... and it's snappy. Moreover, it is really well integrated with all the other internal tools used for coverage, code analysis, issue tracking, web text editing, and so on.

I think an IDE (like IntelliJ IDEA) can't reach that level of integration with several other systems unless you fully buy into the ecosystem a company like JetBrains offers you (their issue tracker, their code review tool, ...).

So, summarizing, it's a tool made by Googlers for Googlers' needs and it's amazing using it every day for all the above reasons.


You can search for non-explicit dependencies. E.g. if you're removing a command line flag in a C++ binary, you can search for all uses of that flag across all users of your binary to make sure it is safe to remove.


It's handy for finding examples of real world use of some api. I usually use https://codesearch.debian.net/


I'm guessing the benefit is the ability to not have to bring up an IDE and download the code in cases where that isn't a quick option.


This name is misleading. The domain is okay.

It should probably be 'Google Code Search'. I would have expected Google to come up with a search engine for all Open Source code otherwise.


Nice to see GN there. I wish more people knew about it.

For me it is as powerful as Bazel, but without the need for a JVM and all the insanity that comes with it in a desktop/dev environment.

The syntax is great and powerful (insane customization), and together with Ninja there's nothing like it.

It's in C++, and even while being as powerful as Bazel, it's a light, standalone library that can handle a huge amount of source code, dependencies, tools, and configurations.


Having tried to battle GN configs... I don't agree.

I was working on a big source tree and got frustrated that it kept rebuilding files that hadn't changed just because I switched git branches to look at one file, and then suddenly "Yay, another 18 hour full rebuild!".

I tried to fix it and found there is no option to ignore file timestamps, and some guy has tried to patch it to do that[1]... But the patch requires putting an option in GN files which seems to break them wherever I put it... I tried to patch GN, but it wouldn't ever seem to pass that option through... Ended up patching Ninja to always have the option on, but then random other operations broke (like simple file copies).

A day wasted, and problem not solved. Maybe my use case isn't common, or a bad workman blames his tools, but for me at least it wasn't a nice experience.

[1]: https://github.com/ninja-build/ninja/issues/1459


Sidebar question: does anyone know how they've made the interaction/animation on this page [1]? I feel like it is a great way to show a lot of info in a concise way.

[1] - https://opensource.google/projects/explore/featured


Agreed, it is a very nice little interaction! It seems like they're animating the bubbles around a circle while randomly fluctuating the speed and radius at which they rotate. Clicking on a bubble centers it by setting the rotation radius to `0` and expanding the size.

Would be interested to know how they expand the bubbles as your cursor moves closer.
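
The guess in the first paragraph above can be sketched in a few lines of Python. To be clear, this is only my reading of what the page appears to do, not their code: the `Bubble` class, the jitter amounts, and the size values are all invented for illustration, and it does not cover the grow-near-the-cursor effect.

```python
import math
import random


class Bubble:
    """One bubble orbiting a shared center (all names here are hypothetical)."""

    def __init__(self, angle, radius, speed):
        self.angle = angle    # current angle in radians
        self.radius = radius  # current distance from the center
        self.speed = speed    # radians per second
        self.size = 1.0       # relative display size

    def step(self, dt, rng):
        """Advance by dt seconds, randomly nudging speed and radius."""
        if self.radius > 0:
            self.speed += rng.uniform(-0.05, 0.05)
            self.radius = max(1.0, self.radius + rng.uniform(-0.5, 0.5))
        self.angle = (self.angle + self.speed * dt) % (2 * math.pi)

    def select(self):
        """Center the bubble: rotation radius 0 and a larger size."""
        self.radius = 0.0
        self.size = 2.0

    def position(self, cx=0.0, cy=0.0):
        """Return the (x, y) position for a circle centered at (cx, cy)."""
        return (cx + self.radius * math.cos(self.angle),
                cy + self.radius * math.sin(self.angle))


rng = random.Random(0)  # seeded so runs are repeatable
bubbles = [Bubble(angle=i * math.pi / 2, radius=100.0, speed=0.3) for i in range(4)]
for _ in range(60):  # simulate 60 frames at roughly 60 fps
    for b in bubbles:
        b.step(1 / 60, rng)
bubbles[0].select()
print(bubbles[0].position())  # (0.0, 0.0)
```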


We are building similar experiences for internal repos.

Demo: https://demo.repomono.com/cs/view.php

Code is here: https://github.com/repomono/cs


It looks like they haven't integrated the kythe cross reference DB as the symbols aren't clickable


A few projects have configured kythe for at least one language. See bazel, go, gvisor, kythe, and tensorflow.


First impression is that it enables discoverability of code across the open sourced Google projects, but trying to find this page even on Google search is not a thing yet. Is that intentional?


This is very useful to read and search TensorFlow source code. It definitely beats Github for me.


How strange - OpenDNS/Cisco Umbrella seems to flag the domain and gives me a 403 Forbidden.


Does anyone know the pros and cons of this vs something like Elastic search?


So far it doesn't seem to index a lot of stuff. I searched for some terms out of my kubernetes/openshift dependencies and it didn't find them. Is this correct?


Is this a frontend for Grok - the thing that Steve Yegge built?


For Java and JavaScript there's also codota.com


Nice, my team's project is on there! (Nomulus)


Sourcegraph CEO here. This is the same underlying code search offered for a while by Google Cloud Source Repositories for private code, and it’s cool to see this usable for Google’s own open-source code, too.

If you want to get universal code search for your own (private) code on any/all code hosts, Sourcegraph is easy to set up internally (self-hosted Docker install) at https://docs.sourcegraph.com/. Or you can get code search for all OSS projects at https://sourcegraph.com/search. More general info at https://about.sourcegraph.com.

Lots of Xooglers and current Googlers use Sourcegraph, too. Just mentioning Sourcegraph because I’ve seen several other folks mention us in the comments (thanks!).


Thanks. Sourcegraph seems better from my 1-minute test?

Following this article https://bugs.xdavidhu.me/google/2020/03/08/the-unexpected-go... recently featured on HN, I tried

https://cs.opensource.google/search?q=%2F%5E(%3F:(%5B%5E:%2F... : no result

https://sourcegraph.com/search?q=/%5E%28%3F:%28%5B%5E:/%3F%2... : 2 results !


Great product. Used it extensively at Uber.


Thank you! Any complaints/requests/other feedback? How could we make it better?


Yes! Publish a GraphQL schema that is parseable by Apollo GraphQL please! I tried to use your API on my company's internal Sourcegraph setup and had to hand-roll my API calls because of these errors.


Sorry to hear this didn't work for you!

I filed an issue on sourcegraph/sourcegraph, would you mind posting more details there about the errors you ran into? https://github.com/sourcegraph/sourcegraph/issues/8970


Seems like this is in the works already, but the boolean operators in OpenGrok are so intuitive and powerful. I use them every single day and the lack of support in Sourcegraph immediately disqualified it for us. For example yesterday I was looking for Dropwizard Managed classes not annotated with @Singleton so I did:

Managed && !"@Singleton"

(I'm omitting the fully-qualified class name for brevity)

If I also wanted to look for HealthCheck classes I could update the query to:

(Managed || HealthCheck) && !"@Singleton"

I think it also helps that OpenGrok has a separate input for filtering file paths (completely splitting the "where" and the "what" parts of the query). And this file path search supports the same boolean operators. So if I want to narrow my search to two particular repositories I could put CrmSearch || AutomationPlatform into the File Path input. And because this input only handles file paths, I don't need to remember any special syntax. Whereas if you clump the entire query into a single input, then users need a way to tell you whether a search term applies to file paths or file contents.
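
To make the semantics of those two queries concrete, here's a small Python sketch that applies the same boolean logic to file contents. It is not how OpenGrok evaluates queries internally; the `matches` helper, file names, and file contents are invented, and plain substring checks stand in for real indexed term matching.

```python
# Invented sample "repository": file name -> file contents.
files = {
    "CacheManager.java": "@Singleton\npublic class CacheManager implements Managed {}",
    "QueueWorker.java": "public class QueueWorker implements Managed {}",
    "DbHealth.java": "public class DbHealth extends HealthCheck {}",
    "Util.java": "public final class Util {}",
}


def matches(contents, any_of, none_of):
    """True if contents has at least one any_of term and no none_of term."""
    return (any(term in contents for term in any_of)
            and not any(term in contents for term in none_of))


# Managed && !"@Singleton"
print(sorted(n for n, c in files.items() if matches(c, ["Managed"], ["@Singleton"])))
# ['QueueWorker.java']

# (Managed || HealthCheck) && !"@Singleton"
print(sorted(n for n, c in files.items()
             if matches(c, ["Managed", "HealthCheck"], ["@Singleton"])))
# ['DbHealth.java', 'QueueWorker.java']
```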


Engineer at Sourcegraph here. Adding boolean operators is a priority on our roadmap, and expected to go live between May and July this year. On separate inputs: definitely something we've also identified and are actively working on. One recent experimental addition is "Interactive mode" that lets you enter patterns separately for repos, files, and patterns, and so on. There's a dropdown next to the query bar to try it out--there are some kinks, and we're currently working on making this a polished feature. Thanks for the feedback, and stay tuned!


Awesome, thanks for the response. Looking forward to trying out Sourcegraph again


I'm at FB now and we have an internal code search. I apologize, I can't recall any feedback for sourcegraph.


I've looked at this a few times but never given it a go. Going to actually try it this week for $DAYJOB.

One nice-to-have would be support for C. Does the C++ extension work for C as well?

Thanks!


Yes, the extension supports C, C++, and C#! Sourcegraph supports over 30 languages out of the box using our basic code intelligence (search based heuristics and ctags).

Check out the recent release notes: https://about.sourcegraph.com/blog/sourcegraph-3.13#basic-co... And how out of the box code intelligence works: https://docs.sourcegraph.com/user/code_intelligence/basic_co...


Thanks for the info!


What’s the code licensed under? It’s not clear from your site at first glance.


https://github.com/sourcegraph/sourcegraph/blob/master/LICEN... says: "LICENSE.apache (Apache License) applies to all files in this repository, except for those in the enterprise/ and web/src/enterprise/ directories, which are covered by LICENSE.enterprise."

Thus, it's Apache 2.0 plus a custom license that requires you to accept the terms and have the correct number of seats, and that does not allow you to "copy, merge, publish, distribute, sublicense, and/or sell the Software."

I am not sure what all of this means, though. Better check out the licenses yourself :)


It's open core (Apache 2 + some non-OSS parts for enterprise features). All of the code is public and we develop in the open at https://github.com/sourcegraph/sourcegraph.


Sourcegraph only supports Git repositories, so it's not very useful for enterprises using Mercurial, SVN, or other version control systems.

There is another open source application for code search, OpenGrok [1] (it's completely open source, unlike Sourcegraph, and supports multiple version control systems besides Git).

Take a look. It's easy to install and operate on bare metal, cloud, and containers, instead of the convoluted Sourcegraph way of Kubernetes or Docker.

[1] https://github.com/oracle/opengrok


You can always bridge to Git from SVN and Mercurial. It is almost seamless, and after generating the Git repository everything will work.


Many organizations don’t use or want to use Git. This is another convoluted solution, trying to fit a square peg in a round hole.

Another reason not to use Sourcegraph is that it’s proprietary (with some open source parts), unlike OpenGrok, which is fully open source.


Is Sourcegraph "open core" like how Redis is "open core", e.g. the main code is open but there are paid, closed-source modules and extensions?


Sourcegraph is open core like how GitLab and VS Code are open core. You can run "Sourcegraph OSS" and get limited features, or you can run Sourcegraph (see https://docs.sourcegraph.com/#quickstart) and get all the features, but you need a license key when you hit the user limit.


Is there a variant of this that integrates with IDE of choice?


I really hate that some of the elements on the page are translated into a different language, seemingly based on my IP. When did it become acceptable to ignore my browser or my system language settings? The same thing happens on other Google services (like Google Groups), but I noticed this trend on other websites too.


This annoys me to no end. I live in Hong Kong. I speak English. We have 2 official languages here, one of which is English. I travel frequently to Japan as well, with infrequent trips to either Europe or North America.

My 'preferences' and settings are a total disaster. I end up having to go onto the gray market to buy gift cards and prepaid credit cards as I seemingly never can buy stuff online when I want to, as I'm either in the wrong place, or in the wrong language. But I know I'm still me.

What is with this '100% of people in this location read/speak the same language' assumption?

What if I want to learn Russian, but I'm in China? Why can't I just tell my computer to show me Russian, and have the browser tell the site "give me Russian if you have it"?

Why is this so hard?

I really dislike things that try to make it easy for me, as all they do is prevent me from being able to function.


The actual mechanism by which your user agent tells the server which languages you are interested in is even more robust [0]! It is a weighted list of preferences.

The reasons I've heard from web developers for not using this are that they believe the user probably never set it up right, and that multiple people could be using the same web browser, so they need to be able to do the right thing.

What I typically do is select the best matching language from the Accept-Language HTTP header, and then override it with a session-specific value IF one is supplied. Example:

1. https://rkeene.org/projects/6to4/

2. https://rkeene.org/projects/6to4/?lang=fr

You can see PART of the problem here from the web developer's perspective. This isn't a negotiation, so you have no way to know which languages the server supports. If your preferences aren't totally inclusive you'll get something "wrong". This can be solved by exposing that information (as not done above) and allowing the user to override it (as above).

[0] https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Ac...
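
A rough Python sketch of that approach: parse the weighted Accept-Language list, pick the best language the site actually supports, and let an explicit `lang` value win if one is supplied. The function names, the `SUPPORTED` list, and the fallback to English are assumptions for illustration, not the code behind the site linked above, and the matching is simplified compared to the full language-range rules in the HTTP spec.

```python
SUPPORTED = ["en", "fr", "de"]  # assumed set of translations the site offers


def parse_accept_language(header):
    """Parse 'en-US,en;q=0.9,fr;q=0.8' into [(tag, weight)], best first."""
    prefs = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        weight = 1.0  # a missing q parameter means full weight
        params = params.strip()
        if params.startswith("q="):
            try:
                weight = float(params[2:])
            except ValueError:
                weight = 0.0  # treat a malformed weight as "not acceptable"
        prefs.append((tag.strip().lower(), weight))
    # sorted() is stable, so equal weights keep their header order.
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def choose_language(accept_header, override=None, default="en"):
    """Pick a language: explicit override first, then header, then default."""
    if override and override.lower() in SUPPORTED:
        return override.lower()
    for tag, weight in parse_accept_language(accept_header):
        if weight <= 0:
            continue  # q=0 means the language is not acceptable
        primary = tag.split("-")[0]  # 'fr-fr' falls back to 'fr'
        if primary in SUPPORTED:
            return primary
    return default


print(choose_language("fr-FR,fr;q=0.9,en;q=0.5"))        # fr
print(choose_language("fr-FR,fr;q=0.9,en;q=0.5", "de"))  # de
print(choose_language("hu,ru;q=0.8"))                    # en
```

The last call shows the "not a negotiation" problem: the client asked only for Hungarian and Russian, the server has neither, and the client cannot tell in advance that it will get English.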


The problem is that every big website operator wants to make it work correctly for you, and they (1) have different definitions for success and (2) assume you're incompetent.

In the first point, there's someone with a requirements document that assumes every country has one official language and everyone in that country speaks that language, and so feels successful and internationalization-ready when a geo-IP served page is automatically switched to the "correct" language (much like "Falsehoods programmers believe about names").

Second, configuring a computer's locale so that the browser's request headers are set correctly is beyond the technical expertise of many users. It would be better if things were consistent, but once some locales were set incorrectly and others were deliberately set to something unusual, your analytics would have shown that guessing the locale (screwing over users who knew how to use their computer) improved the situation on average more than respecting it and eventually getting everyone to understand how to set their desired language.


If you install a Hungarian Firefox, the Accept-Language header will reflect this (or it did when I tried it last time). Non-expert users also often choose software in their own language version. I don't have numbers, but I wouldn't be surprised if a lot of browsers were sending correct Accept-Language headers.

I don't know about IE, but it was in a very good position to guess the language of the user as well.


Agreed, but then they should give the users a way of overriding that. Google in particular knows what users want because they force you to answer it when you create your account. I've set every single location setting to UK, and yet when I travel abroad they insist on ignoring it. No excuse there.


> based on my IP

heh

in my case I wish it was based on my IP

but for some unbeknownst reason, some components of the page are in Russian. I'm not in Russia; nothing in my browser request indicates I'd like to read Russian.


I live in the US and have a local US IP. A year or so ago, I made a site for a side project using vanilla HTML. No frameworks, no JS. Every word on the page was in English and could be found in an English dictionary.

When I first stood the site up and tested it, Chrome would always break in as soon as it loaded, with a popup to translate the site into _English_ from _Romanian_!

I was able to suppress this only by turning on every single language hint in META.


Maybe your IP is located in Russia according to the geo IP database they are using.


I live in Switzerland and it is really annoying, as my IP switches between the German and the French part every few weeks.

I also wonder what happens in places where people have traditionally always been from different language groups. I don't know whether there is always a single common language there.


A great part of browsing the internet from Switzerland is seeing web pages that are partially in German, French, and Italian. Even when set to English :D


same issue for me :(


Add the "hl=en" param to the URL, works for almost all Google services:

https://cs.opensource.google/?hl=en
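
If you want to apply that trick to arbitrary links, here is a minimal Python sketch. The only thing taken from the comment above is the "hl" parameter name; the helper name with_language is made up for illustration, and whether a given service honours "hl" is an assumption you would need to check per service.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_language(url: str, lang: str = "en") -> str:
    """Return url with the hl query parameter set to lang.

    Any existing hl parameter is replaced; other parameters are kept in order.
    with_language is a hypothetical helper, not part of any Google API.
    """
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "hl"
    ]
    query.append(("hl", lang))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_language("https://cs.opensource.google/"))
```

Replacing rather than appending matters when the link already carries an hl value, since a duplicate parameter leaves it up to the server which one wins.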


Have you tried changing your Accept-Language header? I.e., it is not based strictly on your IP, but also on what your browser asks for.

I'm based in France, with a French IP, but my browser sends Accept-Language: en-US,en;q=0.9,fr-FR;q=0.8,fr;q=0.7

And this page is fully in English.
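
For anyone unsure how a header like that is meant to be read, here is a rough Python sketch of server-side negotiation. It is a simplification under stated assumptions: the function names are invented for illustration, matching is exact tag first and then primary subtag, and nothing here claims to be what Google actually runs.

```python
def parse_accept_language(header: str) -> list[tuple[str, float]]:
    """Parse an Accept-Language value into (tag, q) pairs, highest q first."""
    entries = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        tag, *params = [piece.strip() for piece in item.split(";")]
        quality = 1.0  # a tag without an explicit q counts as q=1
        for param in params:
            if param.lower().startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0  # treat a malformed weight as "not acceptable"
        entries.append((tag.lower(), quality))
    # sorted() is stable, so tags with equal q keep their header order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def pick_language(header: str, supported: list[str], default: str = "en") -> str:
    """Pick the best supported language for a header (hypothetical helper)."""
    supported_set = {tag.lower() for tag in supported}
    for tag, quality in parse_accept_language(header):
        if quality <= 0:
            continue
        if tag in supported_set:
            return tag
        primary = tag.split("-")[0]  # e.g. "fr-fr" falls back to "fr"
        if primary in supported_set:
            return primary
    return default


header = "en-US,en;q=0.9,fr-FR;q=0.8,fr;q=0.7"
print(pick_language(header, ["fr", "en"]))
```

With the header quoted above, a site that supports both French and English should land on English, regardless of where the IP is located.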


My headers are set to en-US,en;q=0.9 (no other languages) on both Chrome and Firefox and the page is partially translated on both.


It's IP only.


No, it's definitely a mix, as my example in the parent comment describes. If I change my Accept-Language header to remove the English priority, I get the French translation that matches my French IP.


I get this extremely annoying behaviour from many sites (sometimes without even an easy option to switch), including Google Search but not here.


For me, on the main page it's mostly tooltips (More Elements in the navbar, Help in the searchbar) and the blue Show Project link. It's worse on the project pages where the description is in English, but the entire table (including dates) is localized.


Same here. Seems like a bug.


I find this extremely frustrating as well, across the various Google services I use.


Hasn't that been acceptable since the dawn of the web?

Lots of people log errors to some sort of monitoring system. I can't remember seeing any localisation/translation API that would log an error rather than just silently serve English. I infer from this that just serving English is universally accepted and considering it an error is so rare that I've yet to see an API that caters to it.
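
To make the distinction concrete, here is a small Python sketch of a lookup that still serves English but records the miss. Everything in it is hypothetical: the CATALOGS data, the translate function and the "i18n" logger name are illustrative and do not come from any real localisation library.

```python
import logging

logger = logging.getLogger("i18n")

# Hypothetical message catalogs; the French one is deliberately incomplete.
CATALOGS = {
    "en": {"help": "Help", "show_project": "Show project"},
    "fr": {"help": "Aide"},
}


def translate(key: str, lang: str, fallback: str = "en") -> str:
    """Return the message for key in lang, falling back to English.

    Unlike a silent fallback, a missing translation is logged so that it
    shows up in monitoring. A key missing from the fallback catalog too
    raises KeyError, since that is a genuine bug.
    """
    catalog = CATALOGS.get(lang, {})
    if key in catalog:
        return catalog[key]
    logger.warning(
        "missing translation: key=%r lang=%r, serving %r instead",
        key, lang, fallback,
    )
    return CATALOGS[fallback][key]


print(translate("help", "fr"))
print(translate("show_project", "fr"))
```

The user-visible behaviour is identical to the silent version; the only difference is that someone can now count how often it happens.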


I see someone downvoted, and today's a frustrating day, so let me ask for more.

English is basically the world's default language (like it or not). Sites that translate partially but sometimes show English text instead of the language specified by the browser expect the user to understand the world's default.

A language inferred from geoip is the user's area's default language. Sites that show that language instead of that specified by the browser expect the user to understand the area's default.

These two behaviours seem really quite similar to me. Their technical backgrounds differ, but the resulting behaviour is much the same. One has been widely accepted since the dawn of the web, AFAICT, which leads me to believe that the other has been just as acceptable for just as long. And thus, my answer to "when did it become acceptable" is "it always was, you just didn't notice".


It's probably not your IP but your browser settings, which are taken from the OS.


It's part of the "localization" push by governments, news/media, some consumers and tech companies themselves. I guess it's okay for passive consumers, but for tech or advanced/active consumers, it's annoying. I'm pretty sure most major sites/apps/etc all localize. So your google search results, youtube frontpage, etc will be different based on your location.


Localization is not a problem; not giving users control over their locale is. If you travel to a foreign country and suddenly can't read anything on the websites you regularly visit, that's pretty bad. If multiple languages are spoken in your country and you're forced to use one you don't speak, that's bad as well. Websites should never assume they know better which locale their users want than the users themselves.
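
One way to read that as a rule is a fixed precedence: an explicit user choice beats the browser header, which beats any GeoIP guess. A minimal Python sketch follows, with the caveat that resolve_locale and its parameters are invented for illustration and that q-values in the header are ignored here for brevity (header order is used instead).

```python
from typing import Optional


def resolve_locale(
    user_choice: Optional[str],
    accept_language: Optional[str],
    geoip_guess: Optional[str],
    supported: set[str],
    default: str = "en",
) -> str:
    """Resolve a locale, trusting the user first and GeoIP last (hypothetical)."""
    candidates = [user_choice]
    if accept_language:
        # Simplification: take tags in header order and drop the ";q=..." part.
        candidates += [part.split(";")[0].strip() for part in accept_language.split(",")]
    candidates.append(geoip_guess)

    for candidate in candidates:
        if not candidate:
            continue
        tag = candidate.lower()
        if tag in supported:
            return tag
        primary = tag.split("-")[0]  # "en-gb" falls back to "en"
        if primary in supported:
            return primary
    return default


# A traveller with an account set to British English, browsing from a French IP.
print(resolve_locale("en-GB", "de-CH,de;q=0.9", "fr", {"en", "de", "fr"}))
```

The GeoIP guess only ever matters when the user has said nothing at all, which is the case where guessing is defensible.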


> Localization is not a problem;

I didn't say it was a problem for most users (aka passive users). I said it's a problem because they make it difficult/impossible for tech/active/advanced users to switch it.

> Websites should never assume they know better which locale their users want than the users themselves.

Yes. You just restated my comment. Not allowing tech/active/advanced users the option is the problem. I love comments that appear to debunk what you wrote but instead just restate it in a different way and pretend it is new.


Maybe you think that what I wrote was already implicit in your comment, but I'm still not seeing it, and since you got downvoted, evidently a few others felt the same. Next time you'd probably better write it out explicitly.


> but I'm still not seeing it

Still?

> and since you got downvoted, evidently a few others felt the same.

This has got to be the saddest thing I've ever seen on a forum.

> Next time you'd probably better write it out explicitly.

Okay I'll give it another shot.

https://news.ycombinator.com/item?id=22563157

Let me know if that cleared things up for you.


I hope I get to work for Google sometime.


Just apply. Google is actively hiring all year round.


I am a grad student right now with 2 years of industry experience. Google still prefers people who are extremely good at Data Structures and Algorithms. I like doing them, but not so much to just grind them for the sake of getting into Google. I like to learn how to design big systems and grinding Data Structures and Algorithms seems like a waste of time.


I put in "only" 40 hours of refreshing on data structures and algorithms, and doing some practice coding problems, in the weeks leading up to my interview. And I got the job.

Frankly, it's been the best hourly return on investment of anything I've done in my life up to this point, by far. Assuming I wouldn't have gotten the job otherwise (which seems reasonable), each of those hours spent studying has proven to be worth several tens of thousands of dollars. I'm not exaggerating; I just did the math.

Maybe the interviewing process is broken or sub-optimal or whatever, but it is what it is, and if you can get through it by doing some additional studying, then it's absolutely worth it. Google is a good place to work on designing big systems, so if that's your interest, consider just putting in the work.


This is solid advice. Thanks! I will try to dedicate a portion of my day to brushing up on Data Structures and Algorithms and maybe, eventually, I will get good enough to crack the interview.


Go for it. The worst they can say is no, in which case you can come back in a year.


Yeah, but they need people good at Data Structures and Algorithms. I think I am above average but nowhere near the quality of people that they hire. Also, I am more interested in designing big systems end-to-end and feel that doing a lot of Data Structures and Algorithms is a waste of time.


Tell that to a recruiter. It's their job to find jobs that are a good fit.


Yeah, but the recruiters don't respond.


Then apply for SRE.


I'm just gonna leave this here: https://killedbygoogle.com/


Notably, they killed Google Code Search: https://en.wikipedia.org/wiki/Google_Code_Search

IIRC, Google Code Search was the impetus for creating the RE2 library.


How is this contributing to the discussion? Do you feel cool now after a snarky comment?


Do you?


I am pointing out what he did. And yes I do feel cool for pointing out something egregious.


It's reasonable to warn less-experienced people about Google's historical behavior.

People build their businesses on Google products and services like this, not realizing the 90%+ mortality rate over 5 years or so.


It's reasonable for a big company to try and fail. When they don't try anything, 'they're not innovative'. When they do, it's 'oh look, they failed'.


That's a bit of an oversimplification. Google's failures often garner more marketplace traction than other companies' wildest success stories.

But at the end of the day they are an advertising company. Any product that doesn't help them sell advertising -- and lots of it -- will eventually impede the progress of the careers of the managers and employees who work on it. That's when the axe falls... not when a product "fails," necessarily, but when it's no longer "sexy."


Yes, just like Waymo, a 10-year bet, is for advertising.


Waymo is (a) no longer affiliated with Google, and (b) basically a hobby project. It is comparable to Blue Origin for Bezos, any of Musk's numerous speculative ventures in fields from tunnel-digging to AI, or the original AppleTV for Steve Jobs.

People who work there are very well aware of that, and are OK with it. No one outside the company should allow their own business or career path to depend on Waymo, at this stage. It could vanish tomorrow at the whims of the Alphabet execs and/or directors, because their own business doesn't depend on it.


Lol, Google was one company until 2015. Alphabet and Google have the same CEO. That just shows your biases in trying to prove your point. It was Google who pumped money into Waymo. Have fun using ddg and Firefox.


> Just shows your biases

> Have fun using ddg and Firefox

And you've shown yours.


That is not true at all. You are regurgitating half-baked theories of HNers who are just salty at Google (some balk at their success, some didn't get in, etc.). Google's going to keep changing the world. HN is going to keep complaining.


Ego is a helluva drug, all right.



