How to process more than 350K requests per month free using 3 free ETA services instead of 1 paid
Mad Devs
Posted on July 6, 2020
This is a story about how to avoid spending even a penny by using three ETA (estimated time of arrival) services instead of one. Everything here is based on my personal experience as a back-end developer on the GoDee project, a start-up that lets users book bus seats online.
Prehistory
GoDee is a public transportation app. Traveling by GoDee bus is more convenient than riding the motorbikes common in Southeast Asia and cheaper than taking a taxi. The app lets users find a suitable route, select a time, book a seat, and pay for the ride online. One of GoDee's problems is traffic jams, which severely affect the user experience. Users get tired of waiting and annoyed at having to guess when the bus will arrive. So, to make commuting more convenient, GoDee needed a service to calculate the bus's approximate arrival time, aka ETA.
Developing ETA from scratch would take at least a year. So, to speed up the process, GoDee decided to integrate the Google Distance Matrix API. Later, the team developed its own microservice, Pifia.
Problems
Over time, the business grew, and the user base increased. We ran into a problem: the number of requests to the Google Distance Matrix API kept growing.
Why is this a problem?
Because every request costs money. The Google API provides 10,000 free queries per month, after which every 1,000 queries cost $20. At that time, we had about 150,000 requests per month.
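To put a number on that, here is a small sketch of the monthly bill using only the figures quoted above (10,000 free requests, then $20 per 1,000). The function name and the proportional billing of partial thousands are assumptions made for illustration, not Google's actual billing rules.

```go
package main

import "fmt"

// monthlyCostUSD estimates one month's bill from the article's figures:
// the first freeQuota requests are free, the rest cost pricePer1000 per 1,000.
// Billing partial thousands proportionally is an assumption.
func monthlyCostUSD(requests, freeQuota int, pricePer1000 float64) float64 {
	billable := requests - freeQuota
	if billable <= 0 {
		return 0
	}
	return float64(billable) / 1000 * pricePer1000
}

func main() {
	// 150,000 requests per month, as in the article.
	fmt.Printf("$%.2f\n", monthlyCostUSD(150000, 10000, 20)) // prints $2800.00
}
```

Under these assumptions, 150,000 requests per month comes to about $2,800, which explains why the bill drew attention.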
My mentor was very dissatisfied with that and said the system should cache each ETA and keep it for 30 minutes. At that time, the system sent a request to the Google API every 3 seconds to get fresh data. Polling that often was inefficient: the minibuses were stuck in traffic, so the distance changed only about once every ten minutes. There was another nuance. When, for example, five users ask about the same bus, they are all making the same request. The cache solved this type of problem too.
func newCache(cfg config.GdmCacheConfig,
	pf func(from, to geometry.Coordinate) (durationDistancePair, error)) *Cache {
	res := Cache{
		cacheItems:                  make(map[string]gdmCacheItem),
		ttlSec:                      cfg.CacheItemTTLSec,
		invalidatePeriodSec:         cfg.InvalidationPeriodSec,
		pfGetP2PDurationAndDistance: pf,
	}
	return &res
}

func (c *Cache) get(from, to geometry.Coordinate) (gdmCacheItem, bool) {
	c.mut.RLock()
	defer c.mut.RUnlock()
	keyStr := geometry.EncodeRawCoordinates([]geometry.Coordinate{from, to})
	val, exist := c.cacheItems[keyStr]
	if exist {
		return val, exist
	}
	itemsWithToEq := make([]gdmCacheItem, 0, len(c.cacheItems))
	for _, v := range c.cacheItems {
		if v.to == to {
			itemsWithToEq = append(itemsWithToEq, v)
		}
	}
	for _, itwt := range itemsWithToEq {
		p1 := geometry.Coordinate2Point(from)
		p2 := geometry.Coordinate2Point(itwt.from)
		if c.geom.DistancePointToPoint(p1, p2) > 10.0 {
			continue
		}
		return itwt, true
	}
	return gdmCacheItem{}, false
}

func (c *Cache) set(from, to geometry.Coordinate) (gdmCacheItem, error) {
	keyStr := geometry.EncodeRawCoordinates([]geometry.Coordinate{from, to})
	c.mut.Lock()
	defer c.mut.Unlock()
	if v, ex := c.cacheItems[keyStr]; ex {
		return v, nil
	}
	resp, err := c.pfGetP2PDurationAndDistance(from, to)
	if err != nil {
		return gdmCacheItem{}, err
	}
	neuItem := gdmCacheItem{
		from: from,
		to:   to,
		data: durationDistancePair{
			dur:            resp.dur,
			distanceMeters: resp.distanceMeters},
		invalidationTime: time.Now().Add(time.Duration(c.ttlSec) * time.Second),
	}
	c.cacheItems[keyStr] = neuItem
	return neuItem, nil
}

// invalidate removes every item whose invalidation time has passed.
func (c *Cache) invalidate() {
	c.mut.Lock()
	defer c.mut.Unlock()
	toDelete := make([]string, 0, len(c.cacheItems))
	for k, v := range c.cacheItems {
		if time.Now().Before(v.invalidationTime) {
			continue
		}
		toDelete = append(toDelete, k)
	}
	for _, td := range toDelete {
		delete(c.cacheItems, td)
	}
}

// run invalidates expired items periodically; it blocks, so start it in a goroutine.
func (c *Cache) run() {
	ticker := time.NewTicker(time.Duration(c.invalidatePeriodSec) * time.Second)
	defer ticker.Stop()
	for range ticker.C {
		c.invalidate()
	}
}
Alternative services
The cache worked, but not for long: GoDee grew even further and faced the same problem again, as the number of queries increased.
We decided to replace the Google API with OSRM. Basically, OSRM is a service for building a route based on ETA (this is a rough and short description; see the OSRM documentation for details).
"The Open Source Routing Machine or OSRM is a C++ implementation of a high-performance routing engine for the shortest paths in road networks." (Wikipedia)
OSRM has one problem: it builds routes and calculates ETA without taking traffic into account. To solve this, I started looking for services that provide traffic information for a given part of the city. HERE Traffic provided the data I needed. After a short study of the documentation, I wrote a small piece of code that fetches traffic information every 30 minutes. To upload the traffic information to OSRM, I wrote a small script with the command:
./osrm-contract data.osrm --segment-speed-file updates.csv
You can find more information in the OSRM documentation.
Math time: every half hour there is one request to HERE for traffic information. That is two requests per hour, 48 per day (24 * 2 = 48), about 1,488 per month (48 * 31 = 1,488), and 17,520 per year (48 * 365 = 17,520). At that rate, a single month of HERE's free quota (250,000 requests) would cover about 14 years of traffic updates.
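The same arithmetic as a tiny sketch; the function name is hypothetical, and the only input is the 30-minute polling interval from the text.

```go
package main

import "fmt"

// pollsPer returns how many requests are sent over the given number of days
// when polling once every intervalMinutes.
func pollsPer(days, intervalMinutes int) int {
	return days * 24 * 60 / intervalMinutes
}

func main() {
	fmt.Println(pollsPer(1, 30))   // 48 per day
	fmt.Println(pollsPer(31, 30))  // 1488 per 31-day month
	fmt.Println(pollsPer(365, 30)) // 17520 per year
}
```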
// everything that these structures mean is described here https://developer.here.com/documentation/traffic/dev_guide/topics/common-acronyms.html
type hereResponse struct {
	RWS []rws `json:"RWS"`
}

type rws struct {
	RW []rw `json:"RW"`
}

type rw struct {
	FIS []fis `json:"FIS"`
}

type fis struct {
	FI []fi `json:"FI"`
}

type fi struct {
	TMC tmc  `json:"TMC"`
	CF  []cf `json:"CF"`
}

type tmc struct {
	PC int     `json:"PC"`
	DE string  `json:"DE"`
	QD string  `json:"QD"`
	LE float64 `json:"LE"`
}

type cf struct {
	TY string  `json:"TY"`
	SP float32 `json:"SP"`
	SU float64 `json:"SU"`
	FF float64 `json:"FF"`
	JF float64 `json:"JF"`
	CN float64 `json:"CN"`
}

type geocodingResponse struct {
	Response response `json:"Response"`
}

type response struct {
	View []view `json:"View"`
}

type view struct {
	Result []result `json:"Result"`
}

type result struct {
	MatchLevel string   `json:"MatchLevel"`
	Location   location `json:"Location"`
}

type location struct {
	DisplayPosition position `json:"DisplayPosition"`
}

type position struct {
	Latitude  float64 `json:"Latitude"`
	Longitude float64 `json:"Longitude"`
}

type osmInfo struct {
	Waypoints []waypoints `json:"waypoints"`
	Code      string      `json:"code"`
}

type waypoints struct {
	Nodes    []int     `json:"nodes"`
	Hint     string    `json:"hint"`
	Distance float64   `json:"distance"`
	Name     string    `json:"name"`
	Location []float64 `json:"location"`
}

type osmDataTraffic struct {
	FromOSMID int
	ToOSMID   int
	TubeSpeed float64
	EdgeRate  float64
}

// CreateTrafficData - function creates a CSV file containing traffic information
func CreateTrafficData(h config.TrafficConfig) error {
	osm := make([]osmDataTraffic, 0)
	x, y := mercator(h.Lan, h.Lon, h.MapZoom)
	quadKey := tileXYToQuadKey(x, y, h.MapZoom)
	trafficInfo, err := getTrafficDataToHereService(quadKey, h.APIKey)
	if err != nil {
		return err
	}
	// guard against an empty response before indexing into it
	if len(trafficInfo.RWS) == 0 {
		return errors.New("traffic response is empty")
	}
	for _, t := range trafficInfo.RWS[0].RW {
		if len(t.FIS) == 0 {
			continue
		}
		for j := 0; j < len(t.FIS[0].FI)-1; j++ {
			item := t.FIS[0].FI[j]
			if len(item.CF) == 0 {
				continue
			}
			// resolve the street description to coordinates, then to OSM node IDs
			position, err := getCoordinateByStreetName(item.TMC.DE, h.APIKey)
			if err != nil {
				logrus.Error(err)
				continue
			}
			osmID, err := requestToGetNodesOSMID(position.Latitude, position.Longitude, h.OSMRAddr)
			if err != nil {
				logrus.Error(err)
				continue
			}
			if len(osmID) < 2 {
				continue
			}
			osm = append(osm, osmDataTraffic{
				FromOSMID: osmID[0],
				ToOSMID:   osmID[1],
				TubeSpeed: 0,
				EdgeRate:  item.CF[0].SU,
			})
		}
	}
	return createCSVFile(osm)
}

// http://mathworld.wolfram.com/MercatorProjection.html
// mercator - function converts latitude (lan) and longitude (lon) in degrees
// into fractional tile coordinates at zoom level z
func mercator(lan, lon float64, z int64) (float64, float64) {
	latRad := lan * math.Pi / 180
	n := math.Pow(2, float64(z))
	xTile := n * ((lon + 180) / 360)
	yTile := n * (1 - (math.Log(math.Tan(latRad)+1/math.Cos(latRad)) / math.Pi)) / 2
	return xTile, yTile
}

// tileXYToQuadKey - function builds a quadkey string from tile coordinates,
// one digit (0-3) per zoom level, starting from the most significant bit
func tileXYToQuadKey(xTile, yTile float64, z int64) string {
	quadKey := ""
	for i := uint(z); i > 0; i-- {
		var digit = 0
		mask := 1 << (i - 1)
		if (int(xTile) & mask) != 0 {
			digit++
		}
		if (int(yTile) & mask) != 0 {
			digit = digit + 2
		}
		quadKey += fmt.Sprintf("%d", digit)
	}
	return quadKey
}

// requestToGetNodesOSMID - function for getting osm id by coordinates
func requestToGetNodesOSMID(lan, lon float64, osrmAddr string) ([]int, error) {
	osm := osmInfo{}
	// OSRM expects the longitude first and then the latitude
	// WARN only Ho Chi Minh
	url := fmt.Sprintf("http://%s/nearest/v1/driving/%v,%v", osrmAddr, lon, lan)
	resp, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	// close the body so the underlying connection is not leaked
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("Status code %d", resp.StatusCode)
	}
	body, err := ioutil.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	err = json.Unmarshal(body, &osm)
	if err != nil {
		return nil, err
	}
	if len(osm.Waypoints) == 0 {
		return nil, fmt.Errorf("Nodes are empty, lan: %v, lon: %v", lan, lon)
	}
	return osm.Waypoints[0].Nodes, nil
}

// https://developer.here.com/documentation/geocoder/dev_guide/topics/quick-start-geocode.html
// getCoordinateByStreetName - function for getting the coordinates by street name
func getCoordinateByStreetName(streetName, apiKey string) (position, error) {
	streetName += " Ho Chi Minh"
	url := fmt.Sprintf("https://geocoder.ls.hereapi.com/6.2/geocode.json?apiKey=%s&searchtext=", apiKey)
	gr := geocodingResponse{}
	// join the words of the street name with "+" to form the search text
	streetNames := strings.Split(streetName, " ")
	for _, s := range streetNames {
		url += s + "+"
	}
	resp, err := http.Get(url)
	if err != nil {
		return position{}, err
	}
	// close the body so the underlying connection is not leaked
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return position{}, fmt.Errorf("Status code %d", resp.StatusCode)
	}
	body, err := ioutil.ReadAll(resp.Body)
	if err != nil {
		return position{}, err
	}
	err = json.Unmarshal(body, &gr)
	if err != nil {
		return position{}, err
	}
	if len(gr.Response.View) == 0 {
		return position{}, errors.New("View response empty")
	}
	for _, g := range gr.Response.View[0].Result {
		if g.MatchLevel == "street" {
			return g.Location.DisplayPosition, nil
		}
	}
	return position{}, fmt.Errorf("street: %s not found", streetName)
}

// getTrafficDataToHereService - function requests traffic flow data from HERE for the given quadkey
func getTrafficDataToHereService(quadKey, apiKey string) (hereResponse, error) {
	rw := hereResponse{}
	url := fmt.Sprintf("https://traffic.ls.hereapi.com/traffic/6.2/flow.json?quadkey=%s&apiKey=%s", quadKey, apiKey)
	resp, err := http.Get(url)
	if err != nil {
		return rw, err
	}
	// close the body so the underlying connection is not leaked
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return rw, fmt.Errorf("Status code %d", resp.StatusCode)
	}
	body, err := ioutil.ReadAll(resp.Body)
	if err != nil {
		return rw, err
	}
	err = json.Unmarshal(body, &rw)
	if err != nil {
		return rw, err
	}
	return rw, nil
}

func createCSVFile(data []osmDataTraffic) error {
	if err := os.Remove("./traffic/result.csv"); err != nil {
		logrus.Error(err)
	}
	file, err := os.Create("./traffic/result.csv")
	if err != nil {
		return err
	}
	defer file.Close()
	writer := csv.NewWriter(file)
	defer writer.Flush()
	for _, value := range data {
		str := createArrayStringByOSMInfo(value)
		err := writer.Write(str)
		if err != nil {
			logrus.Error(err)
		}
	}
	return nil
}

func createArrayStringByOSMInfo(data osmDataTraffic) []string {
	var str []string
	str = append(str, fmt.Sprintf("%v", data.FromOSMID))
	str = append(str, fmt.Sprintf("%v", data.ToOSMID))
	str = append(str, fmt.Sprintf("%v", data.TubeSpeed))
	str = append(str, fmt.Sprintf("%v", data.EdgeRate))
	return str
}
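The quadkey step in the code above is the least obvious part, so here is a self-contained sketch of it. It follows the same bit-interleaving logic as `tileXYToQuadKey` but converts to integer tile indices up front; the function names are hypothetical, and the Ho Chi Minh City coordinates and zoom level 12 are only sample inputs.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// latLonToTile converts WGS84 degrees to Web Mercator tile indices at zoom z.
func latLonToTile(lat, lon float64, z uint) (x, y int) {
	latRad := lat * math.Pi / 180
	n := math.Exp2(float64(z))
	x = int(n * (lon + 180) / 360)
	y = int(n * (1 - math.Log(math.Tan(latRad)+1/math.Cos(latRad))/math.Pi) / 2)
	return x, y
}

// quadKey interleaves the bits of x and y, most significant bit first,
// producing one base-4 digit per zoom level.
func quadKey(x, y int, z uint) string {
	var sb strings.Builder
	for i := z; i > 0; i-- {
		digit := byte('0')
		mask := 1 << (i - 1)
		if x&mask != 0 {
			digit++
		}
		if y&mask != 0 {
			digit += 2
		}
		sb.WriteByte(digit)
	}
	return sb.String()
}

func main() {
	fmt.Println(quadKey(3, 5, 3)) // prints 213

	// Sample input: approximate center of Ho Chi Minh City at zoom 12.
	x, y := latLonToTile(10.7769, 106.7009, 12)
	fmt.Println(x, y, quadKey(x, y, 12))
}
```

Each digit selects one of the four child tiles at the next zoom level, so a longer quadkey means a smaller area.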
Preliminary tests showed that the service worked, but there was a problem: HERE returns traffic information as "gibberish", and the data does not match the format OSRM expects. To make the information fit, you need one more HERE service for geocoding plus OSRM (to get points on the map). That adds up to approximately 450,000 requests per month. In the end, we abandoned OSRM because the number of requests exceeded the free limit. We didn't give up: we enabled the HERE Distance Matrix API and temporarily removed the Google Distance Matrix API. The logic with HERE is simple: we send the coordinates of point A and point B and get the bus arrival time.
type response struct {
	Response matrixResponse `json:"response"`
}

type matrixResponse struct {
	Route []matrixRoute `json:"route"`
}

type matrixRoute struct {
	Summary summary `json:"summary"`
}

type summary struct {
	Distance    int `json:"distance"`
	TrafficTime int `json:"trafficTime"`
}

// HereDistanceETA requests a traffic-aware route from HERE and returns its duration and distance.
// NOTE: the original snippet omitted the receiver and the parameters. The receiver type name
// hereService is an assumption; from and to follow the callback signature used by the cache.
func (h *hereService) HereDistanceETA(from, to geometry.Coordinate) (durationDistancePair, error) {
	mr := response{}
	query := fmt.Sprintf("&waypoint%v=geo!%v,%v", 0, from.Lat, from.Lon)
	query += fmt.Sprintf("&waypoint%v=geo!%v,%v", 1, to.Lat, to.Lon)
	query += "&mode=fastest;car;traffic:enabled"
	url := fmt.Sprintf("https://route.ls.hereapi.com/routing/7.2/calculateroute.json?apiKey=%s", h.hereAPIKey)
	url += query
	resp, err := http.Get(url)
	if err != nil {
		logrus.WithFields(logrus.Fields{
			"url":   url,
			"error": err,
		}).Error("Get here response failed")
		return durationDistancePair{}, err
	}
	// close the body so the underlying connection is not leaked
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return durationDistancePair{}, fmt.Errorf("Here service, status code %d", resp.StatusCode)
	}
	body, err := ioutil.ReadAll(resp.Body)
	if err != nil {
		return durationDistancePair{}, err
	}
	err = json.Unmarshal(body, &mr)
	if err != nil {
		return durationDistancePair{}, err
	}
	if len(mr.Response.Route) == 0 {
		return durationDistancePair{}, errors.New("Matrix response empty")
	}
	// trafficTime is in seconds, distance in meters
	res := durationDistancePair{
		dur:            time.Duration(mr.Response.Route[0].Summary.TrafficTime) * time.Second,
		distanceMeters: mr.Response.Route[0].Summary.Distance,
	}
	return res, nil
}
After we installed everything on the test server and started checking, we received the first feedback from the testers. They said that the ETA was wrong. We started looking for the problem and checked the logs (we used Datadog for logging); both the logs and the tests showed that everything worked as expected. We asked the testers to describe the problem in more detail, and it turned out that when a vehicle sat in traffic for 15 minutes, the ETA kept showing the same time. We assumed the cache was to blame, because it stores the original time and does not update it for 30 minutes.
We kept investigating. First, we checked the data in the web version of the HERE Distance Matrix API (which is called HERE WeGo): everything worked fine, and we received the same ETA. We also checked the same case in Google Maps and found no discrepancy. The services themselves show this ETA. We explained everything to the testers and the business, and they accepted it.
Our team lead suggested connecting one more ETA service, bringing back the Google API as a backup option, and writing logic to switch between services (the switch is needed when the number of requests exceeds a service's free limit).
The code works as follows:
val := getCount() // get the number of requests already used
if getMax() <= val { // check against the free request limit of the current service
	newService := switchService(s) // if the limit is reached, switch to the next service
	return newService(from, to) // and hand the request over to the new service
}
The next service we found was Mapbox. We connected it, and it worked. As a result, our ETA had:
- HERE: 250,000 free requests per month
- Google: 10,000 free requests per month
- Mapbox: 100,000 free requests per month
Conclusion
Always look for alternatives. Sometimes the business doesn't want to pay for a service and simply refuses it. As a developer who has worked hard on a feature, you should still bring it to real use. This article describes how we connected several services to get ETA for free because the business did not want to pay for one.
P.S. As a developer, I believe that if the tool is good and does its job well, then you can pay for the tool’s services (or find Open source projects :D).
Previously published at maddevs.io.