Fun to see this up here. Originally when I built this there was some discussion on whether it made any sense to show a global view; most people are just going to use the issues dashboard to look at their own issues, obviously. Honestly, I just thought it was cool to have filters across the whole site, so I left it in (which was only an option given how quick it was to calculate and return these results in Elasticsearch — that's also part of the reason the numbers are fluctuating a bit, as some have pointed out here).
Still wish more people knew about this dashboard view into Issues. Even though it's now a prominent link in the header, I don't think the page got to be something I was really happy with — most of the work was done in the final week before we shipped Issues, so it was somewhat an afterthought. There's a ton of power in there, but it's hidden away behind an arcane syntax that I, the creator of the damn thing, can't really remember at this point, two years later, ha. Still dig the overall motivation behind the page, though!
The real question here is not why this is here, but why "explore" was removed (I'm aware I can still go to /explore directly; if there's a link to it somewhere, it's not clear where). GitHub is still hard for project discovery.
What's going on with all the repeated comments for those top ones? I'm not a frequent user of GitHub issues. Is it perhaps something automated gone wrong?
Isn't this because deep pagination is really costly? I've seen other production systems (especially ones using Elasticsearch) where this limit is in place.
From Elastic:
>Deep Paging in Distributed Systems
>To understand why deep paging is problematic, let’s imagine that we are searching within a single index with five primary shards. When we request the first page of results (results 1 to 10), each shard produces its own top 10 results and returns them to the coordinating node, which then sorts all 50 results in order to select the overall top 10.
>Now imagine that we ask for page 1,000—results 10,001 to 10,010. Everything works in the same way except that each shard has to produce its top 10,010 results. The coordinating node then sorts through all 50,050 results and discards 50,040 of them!
>You can see that, in a distributed system, the cost of sorting results grows exponentially the deeper we page. There is a good reason that web search engines don’t return more than 1,000 results for any query.
The primary product I work on limits on-screen results to 10k (after any filtering), regardless of page size. As an aside, that limit was based on our most complicated queries at the time, which have since been simplified.
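To make the arithmetic in the Elastic quote concrete, here is a minimal sketch. The function name `deep_page_cost` and its parameters are made up for illustration; it only models the number of hits each shard must produce and the coordinating node must merge, not how Elasticsearch actually executes a query.

```python
def deep_page_cost(offset, size, shards):
    """Model the hits handled for one page request in a sharded index.

    offset: number of results skipped before the page (hypothetical name)
    size:   number of results on the page
    shards: number of primary shards queried
    Returns (hits_per_shard, hits_gathered, hits_discarded).
    """
    # Each shard cannot know which of its hits are globally best,
    # so it has to return its own top (offset + size) hits.
    hits_per_shard = offset + size
    # The coordinating node receives and sorts every shard's candidates.
    hits_gathered = hits_per_shard * shards
    # Only one page worth of hits is finally returned.
    hits_discarded = hits_gathered - size
    return hits_per_shard, hits_gathered, hits_discarded


first_page = deep_page_cost(offset=0, size=10, shards=5)
deep_page = deep_page_cost(offset=10_000, size=10, shards=5)
print(first_page)  # (10, 50, 40)
print(deep_page)   # (10010, 50050, 50040)
```

In this simplified model the number of gathered hits grows in proportion to the offset and the shard count, which is why a hard cap on the reachable offset is a common safeguard.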
Probably off topic, but how do you get such a sort to work nicely (sorting on count from another table), without creating redundant data (in this case, maintaining the +1 count in the main issue table as well as maintaining a separate table for +1s), or just delegating the sort to an external search service?
There's no obvious reason why it would be particularly slow to get counts from another table. You're probably going to want to offer full-text searches, and searches on columns you wouldn't choose to index online, as well though, at which point using a search service makes the most sense.
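As a rough illustration of sorting on a count from another table without a redundant counter column, here is a small sketch using SQLite from Python. The `issues` and `plus_ones` tables, their columns, and the sample rows are all invented for the example; a real schema and database would differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE issues (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- The composite primary key also serves as an index on issue_id,
    -- which is what the join below relies on.
    CREATE TABLE plus_ones (
        issue_id INTEGER NOT NULL REFERENCES issues(id),
        user_id  INTEGER NOT NULL,
        PRIMARY KEY (issue_id, user_id)
    );
    INSERT INTO issues (id, title) VALUES
        (1, 'Crash on start'), (2, 'Typo in docs'), (3, 'Slow search');
    INSERT INTO plus_ones (issue_id, user_id) VALUES
        (1, 10), (1, 11), (3, 10), (3, 11), (3, 12);
""")


def issues_by_plus_ones(connection):
    """Return (id, title, plus_one_count) rows, most +1s first."""
    return connection.execute("""
        SELECT i.id, i.title, COUNT(p.user_id) AS plus_one_count
        FROM issues AS i
        LEFT JOIN plus_ones AS p ON p.issue_id = i.id
        GROUP BY i.id, i.title
        ORDER BY plus_one_count DESC, i.id ASC
    """).fetchall()


ranked = issues_by_plus_ones(conn)
print(ranked)
# [(3, 'Slow search', 3), (1, 'Crash on start', 2), (2, 'Typo in docs', 0)]
```

The LEFT JOIN keeps issues with no +1s in the result with a count of zero. Whether this stays fast enough at a given scale is something to measure rather than assume.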
Even though there are several open issues on GitHub, how can someone with little development experience, or a newbie, start contributing?
When asked this question, many suggest that we should first use the code in our own project and contribute to that project by raising issues or fixing them. As beginners, people may start with very popular frameworks like Ruby on Rails or Node.js. Considering their complexity and maturity, it's extremely difficult, if not impossible, to start contributing.
I think that, somewhere down the line, some form of hand-holding or mentorship is needed, where a mentor gives small tasks, offers tips or advice, reviews the first pull request, etc. This would definitely boost contributions to open source projects.
There may be several people providing mentorship, but I feel it's not structured: how does a newbie know that someone is willing to help? The only way I can think of now is to spam a lot of people at random by looking at their GitHub profiles.
Please suggest how to encourage new developers to contribute more to open source and help close the open issues.
It's hard for experienced people, too. The issue is more about the lack of structure in some open source projects and the time available to teach "noobs" a codebase. One thing that has worked for me in the past is to join the development mailing list, try to understand what they are talking about, and go look at the code to try to figure out the issue. Then I trace back through the discussion to see whether any of my questions or suggestions have already been proposed. If not, I make a very simple case for the solution. If so, I keep quiet and only comment when things need clarification. Slowly you will pick up the project and be able to contribute.
If the project lacks any kind of communication channel and is hosted on some online repo, then by all means open an issue and ask about contributing. Make sure to ask which are the most important issues to fix and which are the smaller but most annoying ones. Offer to document the project too.
It's not easy but it is fulfilling once you get underway.
http://up-for-grabs.net/ attempts to make contributing easier for new developers, but it still falls flat IMO: there are few really bite-sized issues you can tackle, and even those are going to require you to read a lot of project code and discussion to figure them out.
Django has a django-core-mentorship mailing list[0] for people interested in starting to contribute, a guide on contributing[1] and a selection of issues tagged as easy-pickings[2] that are suitable for beginners to work on.
I haven't personally tried it, but I did think it was cool when I stumbled over it.
I don't think that list is very active unfortunately. Also the easy pickings list has been mostly completed which doesn't leave a whole lot of room for newbies to contribute.
Funnily enough, having Tim (a paid contributor, also Core dev) do so much of the community work means there is less low hanging fruit for new contributors to get stuck in to.
IRC is often your best bet. If you find a project you'd like to contribute to, see if they have an IRC channel. There will usually be regulars there who have a lot of experience with the project and will almost definitely have advice for beginners wanting to contribute.
Node is extremely friendly to new developers and has labels for "good for beginner" issues as well as a community very passionate about helping others. You should give it a try before giving up.
BTW, contributions can mean documentation or website markup. You probably won't fix a major bug right off the bat.
I wonder if one day GitHub will announce the World's Issue Closing Day. The day every programmer will try hard to close their issues. Though, isn't it what we do every day?
/t/, /d/, and /θ/ are all pretty close, but I've never thought of it as alliteration. But I suppose it does count. Thanks for expanding my literary toolbox.
A bit late to the party. I find that many maintainers are left with a mountain of issues and very few eyeballs to help process them. I made a tool that helps others get involved with your open source projects to, hopefully, help keep your issue count manageable. Check it out: https://www.codetriage.com
Why is the default issue filter "is:open"? When I have an issue with a project, I never want to restrict focus to open issues. In fact, I'd much rather land on a closed issue where it turns out the issue was recently fixed, or there is a workaround, a better approach, etc.
Interestingly enough, when refreshing, the count of closed issues varies wildly, and when looking at closed issues, the count of open issues varies wildly too, by +/- a few million. I wonder what causes that.
My guess? They're giving an estimation based on talking to a few shards of a much larger sharded system rather than trying to actually get canonical results for every shard - since it's unlikely that you need a precise count across that many repositories (which would be really expensive to calculate in real time).
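To illustrate that guess (and only the guess; nothing here reflects how GitHub actually computes the number), here is a toy sketch. The per-shard counts and the `estimate_total` helper are invented. It extrapolates a global count from whichever shards happen to answer, which is enough to make the total jump between refreshes.

```python
def estimate_total(shard_counts, sampled_shards):
    """Extrapolate a global count from a subset of shards.

    shard_counts:   exact per-shard counts (made-up numbers below)
    sampled_shards: indices of the shards actually consulted
    """
    sampled_sum = sum(shard_counts[i] for i in sampled_shards)
    # Scale the sampled sum up to the full number of shards.
    return round(sampled_sum * len(shard_counts) / len(sampled_shards))


shard_counts = [900, 1100, 1000, 1300, 700]  # exact total: 5000

# Three "refreshes" that each happen to consult a different pair of shards.
estimate_a = estimate_total(shard_counts, [0, 1])
estimate_b = estimate_total(shard_counts, [3, 1])
estimate_c = estimate_total(shard_counts, [4, 0])
print(estimate_a, estimate_b, estimate_c)  # 5000 6000 4000
```

The exact total never changes, but the estimate swings by 20% in either direction depending on which shards were sampled.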
Giving a 401 indicates that there might be a resource, though, which can also be harmful.
It is fairly common to return a 404 to unauthorized users (or users without sufficient permissions) so you don't give away meta information. Granted, for the public search it should return an appropriate error code, but they should not do that for private repositories. Thus I think it is fair to assume that they have a policy: if the user/guest does not have sufficient permission, always return a 404.
That makes sense for endpoints like /admin, but it's more confusing than it's worth for users when the endpoint is otherwise rather public. Well, just see this comment thread.
As an example, in this case with the /issues page, redirecting to `/login?redirect-to=/issues` would be more user-friendly since it signals that the page exists but you must authenticate.
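The trade-off discussed in this subthread could be sketched roughly as follows. This is a hypothetical policy function, not GitHub's behavior; the `respond` name, its parameters, and the `/acme/secret-repo` path are assumptions, and only the `/login?redirect-to=/issues` URL shape comes from the comment above.

```python
from urllib.parse import urlencode


def respond(path, resource_is_private, logged_in, has_permission):
    """Return (status_code, redirect_location) for a request.

    Hypothetical policy: hide private resources behind a 404, but send
    anonymous visitors of otherwise public pages to the login form.
    """
    if logged_in and has_permission:
        return 200, None
    if resource_is_private:
        # Do not reveal that the resource exists at all.
        return 404, None
    if not logged_in:
        # The page is public knowledge, so signal that it exists
        # and that authenticating is the next step.
        query = urlencode({"redirect-to": path}, safe="/")
        return 302, "/login?" + query
    # Logged in, page is not secret, but access is not allowed.
    return 403, None


print(respond("/issues", False, False, False))
# (302, '/login?redirect-to=/issues')
print(respond("/acme/secret-repo", True, False, False))
# (404, None)
```

Keeping the decision in one place makes it easier to see which endpoints leak existence information and which ones merely ask the visitor to log in.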
I assume that's to prevent exposing the names of private repositories, correct? For the main (global) search page it would seem reasonably easy to just omit those from the search results.
This number sounds like the number of unread emails in some inboxes. Some have embraced Inbox Zero - is there a similar movement for issues, something like "Bug Zero"?
I've had a policy of no known bugs for a long time, no matter how trivial they are. I'm lucky though in that I don't have a manager sitting over me measuring my rate of feature creation.
So if creating Wikipedia took 100 million hours, closing the world's GitHub issues might be a task about one order of magnitude smaller than creating Wikipedia...
https://github.com/wting/autojump/issues/353 - Yay, I'm mentioned in one of the GitHub open issues (which actually isn't an issue anymore). I wonder how many such open issues there are that are worth closing!
So, roughly a third of all issues are open. I think it would be nice if GitHub created a daily/weekly/monthly/annual "State of the Hub" kind of analysis for the entire ecosystem, with drill-downs and stuff.
It's a SHAME Github is trying to protect its search results.
I often run into this situation when hunting for code using advanced search parameters -- they are preventing people from searching efficiently.
Does anyone know what their motivation behind this is?
Not really sure what you're getting at, but I'm assuming you mean searching for specific syntax or language aspects.
GitHub's definitely not "protecting" shit; it's just that search is a hard problem, and searching code is a really hard problem, at least at the scale they're at. They're running one of the largest Elasticsearch clusters in the world, and a lot of significant things in code are stop words (or not words at all) in most search databases. Not to mention you need to invalidate entire repo indexes when you force push, etc. It just takes a lot of resources, and like anything, will get better over time.
I was under the impression that since the page returned 404 after being posted here, they removed the ability to search using these filters, at the very broad range it was used at.
Now the page is back and I'm not sure what to make of it.
It's not going to be an easy job, to be fair. I also find the search frustrating. I would appreciate the creation of an overarching (Elasticsearch?) index across all their stores, but I would quake at implementing it.
It's a frustrating, thankless task, of course, but if they're looking for a competitive moat, that would make GitLab and Atlassian quake.