Building Yelp for graduate programs


A discussion started on Hacker News three months ago, prompted by an article from Professor Leonard Cassuto in which he talks about the need for a Yelp for graduate programs. This is the original post from Professor Cassuto: https://community.chronicle.com/news/2283-why-we-need-a-yelp-for-doctoral-programs

This is the discussion that ensued on Hacker News: https://news.ycombinator.com/item?id=23901779

Professor Cassuto talked specifically about doctoral programs and research institutions. In the Hacker News discussion, an overwhelming majority of people were of the opinion that it is very difficult to solicit honest opinions and reviews of doctoral programs from students. It was a pretty lengthy discussion, with ideas and thoughts on why this might or might not work, but the overall impression was that it would be pretty difficult to get honest ratings and reviews from students.

It is true that the ratings business can be very tricky, especially when it comes to doctoral or graduate programs. Why would anyone trust a rating from another student who may hold a grudge against the institution?

My biggest takeaway from Professor Cassuto's post was:

What if graduate-school applicants could get that kind of student-centered information in one place, on one website? That information is as important to students — probably more important — than how many citations the faculty have earned for their research publications.

There was another good discussion on Reddit about the need for a site for graduate school search.

What are students' pain points?

Looking at the options available for graduate programs and online master's degrees, these are the major pain points from a student's perspective.

  • Scattered information and unreliable sources: Sites that list course details do so only partially. Often the data comes only from colleges that have signed up on the platform as paying customers, so a vast number of colleges are omitted. Content sites that publish this information often rely on unreliable sources.
  • SEO focus: Most existing sites have gamed SEO (search engine optimization) and are designed for search traffic and lead generation, not for student needs and interests.
  • Bare minimum information: Most of the time, it is a listing of the college with only ranking information and no program details.
  • Paid placement: A good number of the sites that come up on the first page of Google mostly list colleges and programs that are paying customers.
  • Wasted time: Students spend a lot of time searching for colleges and finding admission details.
  • Scholarship and financial aid information is hard to find.

A student going to grad school will spend the next 2-5 years of their life studying that major and shelling out hefty tuition fees. The student should be provided with the best information possible to make the right choice. Each student's needs are different. Some want flexibility, others are looking for part-time programs, some want no GRE requirement, and a big majority are looking for affordability.

Isn't it strange that, in this day and age, there is no single website that provides this kind of information? How hard is it to build such a website?

So what do we expect this Yelp for graduate programs to look like? At a bare minimum, it should offer:

  • A list of all the colleges that offer the program
  • Listings of all the program options available for a given specialization
  • Course details, as much as possible
  • Admission details (GPA, GRE, GMAT, TOEFL, etc.) and what it takes to get into that program at that college
  • Tuition
  • Time to complete the program
  • An option to compare different kinds of programs
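
As a rough illustration, the fields above could be held in a small record per program, with a helper for side-by-side comparison. This is only a sketch: the class name, field names and sample colleges are all made up for the example and are not taken from our actual data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProgramListing:
    """One graduate program at one college (all field names are illustrative)."""
    college: str
    program: str
    specialization: str
    tuition_usd: Optional[int] = None        # None means "not published"
    months_to_complete: Optional[int] = None
    min_gpa: Optional[float] = None
    gre_required: Optional[bool] = None


def compare(listings, field):
    """Return listings sorted by one field, with unknown values placed last."""
    return sorted(
        listings,
        key=lambda p: (getattr(p, field) is None, getattr(p, field) or 0),
    )


# Made-up sample data, only to show the shape of a comparison.
sample = [
    ProgramListing("College A", "MS Data Science", "Data Science", tuition_usd=42000),
    ProgramListing("College B", "MS Data Science", "Data Science", tuition_usd=None),
    ProgramListing("College C", "MS Data Science", "Data Science", tuition_usd=28000),
]
cheapest_first = [p.college for p in compare(sample, "tuition_usd")]
print(cheapest_first)
```

Keeping missing values as None, rather than 0, lets the comparison push programs with unpublished tuition to the end instead of showing them as the cheapest.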

To summarize, we need to build something like this?

Our quest to build the Yelp for the Graduate Programs – minus ratings

Our team was already on this path, and the discussion further validated our assumption that there is a need for a Yelp, or a vertical search engine, for graduate programs. But we do not want to add ratings of our own. If possible, we will bring in ratings from Facebook or Google.

We will build a niche for graduate programs, with a focus on providing students with all the data and letting them make the decision.

Let us look at some examples of niche sites for the most frequently used categories:

  • Books, e-commerce – Amazon
  • Airline tickets – Expedia, Travelocity
  • Hotels – Hotels.com, Expedia
  • Buying a house – Zillow, Redfin
  • Vacation rentals – Airbnb, Vrbo

So why not build a niche for graduate programs?

Our vision of what this website should look like

  1. A central listing: we pull the data and build the listing, and then colleges update the information
  2. Present data to students in an unbiased manner
  3. Build a community of students helping each other
  4. Present information on tuition and acceptance rate
  5. Jobs and career prospects related to each specialization
  6. Faculty information
  7. Provide as much information as possible to help students shortlist a college or choose a program

Step 1: Find the master list of accredited universities

The first thing we needed was a list of all the accredited US universities. This data is available from IPEDS (https://nces.ed.gov/ipeds/). IPEDS data is mostly about undergraduate programs, with very limited data on graduate programs, but it was still a good starting point for at least getting the list of programs.

We picked the list from IPEDS, which served as the basic building block. We thought that with each college's base URL, we could pull data about the college and its different programs. As programmers, the easiest thing for us was to write crawlers and get the data from the colleges' websites.
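
To make the idea concrete, here is a minimal sketch of turning an institution list into crawl seeds. It assumes a CSV export with UNITID, INSTNM and WEBADDR columns, which is how we understand the IPEDS institutional characteristics file to be laid out; check the IPEDS data dictionary before relying on those names. The rows below are made up, and the function names are hypothetical.

```python
import csv
import io

# Made-up rows in the shape of an IPEDS institution export.
# Column names UNITID, INSTNM, WEBADDR are assumptions; check the data dictionary.
SAMPLE_CSV = """UNITID,INSTNM,WEBADDR
100001,Example State University,www.example-state.edu/
100002,Sample Institute of Technology,https://www.sample-tech.edu
100003,No Website College,
"""


def normalize_url(raw):
    """Add a scheme if missing and drop any trailing slash; None if blank."""
    raw = raw.strip()
    if not raw:
        return None
    if not raw.startswith(("http://", "https://")):
        raw = "https://" + raw
    return raw.rstrip("/")


def load_seeds(csv_text):
    """Build the crawl seed list: one (id, name, base URL) per institution."""
    seeds = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        url = normalize_url(row["WEBADDR"])
        if url is not None:
            seeds.append((row["UNITID"], row["INSTNM"], url))
    return seeds


seeds = load_seeds(SAMPLE_CSV)
for unit_id, name, url in seeds:
    print(unit_id, name, url)
```

Institutions without a website address are skipped here rather than guessed at, so the crawler only ever starts from URLs that came from the source list.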

What we realized is that extracting and categorizing this data was one of the most difficult tasks. Sorting the data into the right bucket, in particular, became a huge challenge for us.

And remember, we are focused only on graduate programs (a PhD, graduate or master's program), not undergraduate programs. Initially it appeared to be a simple crawler job: crawl the pages, filter the data and organize it into the right category. But it was becoming very challenging for us. Maybe our team didn't have the skills, or maybe we were not smart enough to do this.


We took a break, went back to the drawing board and started thinking about how to solve this problem. After a few weeks of brainstorming, we tried a different approach and focused on only one specialization from 50 universities to prove that our new logic works. And it did. It took us a good number of iterations to figure out how to filter the data without human involvement.
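
To give a feel for the categorization problem, here is a deliberately simple keyword-counting sketch that guesses whether a page is about a doctoral, master's or undergraduate program. This is not the logic we ended up with; the keyword lists and the function name are assumptions chosen only for illustration, and real pages need far more careful rules.

```python
import re

# Illustrative keyword lists; real college pages are much messier than this.
LEVEL_PATTERNS = {
    "doctoral": re.compile(r"\b(ph\.?d|doctoral|doctorate)\b", re.I),
    "masters": re.compile(r"\b(master'?s?|mba)\b", re.I),
    "undergraduate": re.compile(r"\b(bachelor'?s?|undergraduate)\b", re.I),
}


def classify_level(page_text):
    """Pick the level whose keywords appear most often, or "unknown" if none do.

    Ties go to whichever level is listed first in LEVEL_PATTERNS.
    """
    counts = {
        level: len(pattern.findall(page_text))
        for level, pattern in LEVEL_PATTERNS.items()
    }
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else "unknown"


print(classify_level("The Ph.D. program in Physics admits doctoral students each fall."))
print(classify_level("Campus parking map"))
```

A page that mentions both a bachelor's prerequisite and a master's curriculum shows why simple counting breaks down quickly, which is part of what made this step so hard.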

This is how the information has been organized

Step 2: Where are my GRE and GMAT scores?

Once we got this far, the next task was to extract the GRE and GPA scores from the pages. We found out that GRE scores are otherwise available only on the US News website, behind a paywall.

Our team was able to solve this challenge and extract the information from the pages. It was interesting to find that a good number of universities do not require the GRE for admission to their programs.

We ended up creating a special category “No GRE Universities” for these programs.
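
As a hedged sketch of what this kind of extraction can look like, the snippet below pulls a minimum GPA and a "no GRE" flag out of admission text with regular expressions. The patterns and the function name are assumptions made for this example, not our production rules, and they will miss or misread many real-world phrasings (for instance, a sentence that mentions both the GRE and the GMAT).

```python
import re

# Illustrative patterns only. [^.] keeps each match inside one sentence.
NO_GRE = re.compile(
    r"\bGRE\b[^.]{0,60}\b(not required|optional|waived)\b|\bno GRE\b", re.I
)
MIN_GPA = re.compile(r"\bGPA\b[^.]{0,40}?(\d\.\d{1,2})", re.I)


def extract_requirements(page_text):
    """Return a dict with a no-GRE flag and the first GPA figure found, if any."""
    gpa = MIN_GPA.search(page_text)
    return {
        "no_gre": NO_GRE.search(page_text) is not None,
        "min_gpa": float(gpa.group(1)) if gpa else None,
    }


text = (
    "The GRE is not required for admission. "
    "Applicants need a minimum GPA of 3.0 on a 4.0 scale."
)
print(extract_requirements(text))
```

Programs where the flag comes back true are the kind that would land in a "No GRE" category.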


Step 3: What about scholarships and financial aid?

With this success, we became confident (read: overconfident) and aimed at extracting and organizing scholarship and financial aid information.

It was a classic start-up mistake: trying to do too many things instead of being great at solving just one problem.

We started on this task and after a while realized that we hadn't yet mastered the problem of courses and scores for students, and now we were jumping into something that was going to take a very long time to solve. We had planned to launch the scholarship search in beta in 3 months. After some deliberation, we decided not to work on scholarships and to focus only on courses and scores until we have solved 90% of the problems.

There is no point in going forward and doing so many things without becoming really good at any one of them.

GRE, GMAT scores and GPA information

So far we have been able to extract the information from

https://www.collegehippo.com/college/new-york-university/graduate-programs/educational-instructional-technology-gre-score

GRE scores

Summarizing the journey

October 2020. It has been 2.5 years since we started this journey, and we see a glimmer of hope now that students have started to use our site. We are getting inquiries and questions about programs and career advice. We got some great feedback on what to improve on the site and which features to add, and we are working on them.

We got accepted into the AWS EdStart program a few months back which gave us 10k credits. This helped us save some money on the server costs.

How are we going to monetize the business?

I thought that we could monetize this in two ways:

  1. Students pay a subscription fee, since they save so much time and so many application fees when searching for colleges
  2. Colleges pay us for advertising on our site

Sadly, neither of the two options has worked out so far. We are working on it and will keep our readers updated on how it goes.
