Better data processing and recording mode

Suggestions for WiGLE/JiGLE/DiGLE

6 posts • Page 1 of 1
I am using the F-Droid version on an i-9082; I don't use the Play Store. I scanned a new area including a mall: the outside by car and the inside on foot. GPS was available in the upper open area but intermittent due to signal reflections. Some ground-floor door areas had GPS.
The resulting map had a big cluster of points but also a couple of rays extending out in random directions for a km or so. When I look at cities I see the same phenomenon. It seems that GPS can be unreliable unless account is taken of the device being hand-held and the person walking. Glitched readings should be placed at the averaged position, with outliers removed.
It should be possible to do this without a user interface but I would be ok with a menu choice for auto, foot, car, bike, mall, etc.
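
To make the "averaged position with outliers removed" idea concrete, here is a minimal sketch. It assumes the fixes have already been converted to metres in a local flat frame; the function names, the multiplier k and the 5 m floor are illustrative choices, not anything taken from the WiGLE client or server.

```python
from statistics import median


def remove_outliers(points, k=3.0, floor_m=5.0):
    """Drop fixes that lie far from the median position.

    points:  list of (x_m, y_m) positions in metres (local flat frame).
    k:       tolerance as a multiple of the median distance from centre.
    floor_m: minimum tolerance so tight clusters are not over-pruned.
    """
    cx = median(p[0] for p in points)
    cy = median(p[1] for p in points)
    dists = [((x - cx) ** 2 + (y - cy) ** 2) ** 0.5 for x, y in points]
    limit = max(k * median(dists), floor_m)
    return [p for p, d in zip(points, dists) if d <= limit]


def centroid(points):
    """Plain average of the remaining positions."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


# Five fixes clustered near the origin plus one "ray" point far away.
fixes = [(0, 0), (2, 1), (-1, 3), (1, -2), (3, 2), (900, 400)]
kept = remove_outliers(fixes)
centre = centroid(kept)
```

The median is used for the centre because, unlike the mean, a single wild fix barely moves it.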

Inside malls with many access points it should also be possible to make a non-GPS map based on initial readings taken when GPS was present. Walls can be detected by abrupt signal drops. A menu could ask for current floor information.

I hope these rays can be processed out of city maps. I don't think they are small aircraft as there are no course changes.
Hi Dave,

Thanks for the feedback. Our experience has been that inconsistent GPS/driver behavior under Android is responsible for the vast majority of these cases. Generally, rather than reflections, we receive high-reported-accuracy, high-signal-strength data which is simply incorrect. We have means of detecting this, and are continually incorporating new ideas. We continue (behind the scenes) to process out these sources of error and inaccuracy, but rather than mass-reprocessing all observations, we recalculate when new data for the AP in question arrives, which means that "retired" access points with bad data will hang around. If you tighten the time bound on these access points, do you still encounter the "rays" reaching out from cities?

Cheers and thanks for the feedback,

-Ark
Hi Arkasha,
I'll try taking more scans. One possible reason might be this: if two almost identical strong readings are reported, say, 10 m apart (in fact due to GPS jitter), then the range/transmitter-strength equations, seeing almost no signal drop-off between the two points, might solve to a very strong transmitter at great range. So jitter in one direction puts these solution points along a ray, but the transmitter strength for that solution would be far too high. Looking at the estimated transmitter strength would therefore be a way to detect unrealistic solutions.
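
A rough worked example of that argument, as a sketch only: it assumes a log-distance path-loss model, RSSI(d) = P1m - 10*n*log10(d), with exponent n = 2 and both sample points lying on a straight line through the transmitter. Neither the model nor the function name comes from the WiGLE code; they are just a convenient way to show the numbers.

```python
import math


def implied_source(rssi_near, rssi_far, step_m, n=2.0):
    """Solve for transmitter distance and power from two collinear readings.

    rssi_near, rssi_far: readings in dBm, rssi_near being the stronger one.
    step_m: reported distance between the two sample points, in metres.
    n: assumed path-loss exponent (2.0 corresponds to free space).
    Returns (distance_from_near_point_m, implied_rssi_at_1m_dbm),
    or None when there is no drop-off to solve from.
    """
    delta = rssi_near - rssi_far
    if delta <= 0:
        return None
    ratio = 10 ** (delta / (10 * n))        # d_far / d_near
    d_near = step_m / (ratio - 1)           # because d_far = d_near + step_m
    p_1m = rssi_near + 10 * n * math.log10(d_near)
    return d_near, p_1m


# Almost no drop-off over a (jitter-induced) 10 m step.
jitter_case = implied_source(-50.0, -50.1, 10.0)
# A 6 dB drop over the same step, which is what a nearby AP would show.
normal_case = implied_source(-50.0, -56.0, 10.0)
```

Under these assumptions the jitter case puts the transmitter more than 800 m away with a positive dBm level at 1 m, while the normal case gives roughly 10 m and about -30 dBm at 1 m. A cut-off on the implied level at 1 m would flag the first solution as unrealistic; where to put that cut-off is a judgment call.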
I propose handling malls by switching to non-GPS and finding a holistic 3D map solution, disregarding sample-point readings for stations when they are anomalously low (due to a wall or a person in the way) compared to others. This may take some work and iterations. Alternatively, if the signal strength is obviously in error, use the current position with big error bars for the map; that at least gives a general idea of location.
The rays were coming from my new data and weren't there before.
By time bound, do you mean displaying only recent map point data?
There are various time settings in the scanning phone; if you mean these, I'll need to know what to try.
I did in fact sit at the top of the mall in a rest area and noticed the reported height drift from 5 to 200 feet over time (10 min or so); they probably take the max for a location.
Thanks
David
Hi David,

Frequently, the GPS signal is the problem independent of echo or occlusion - the phone reports "I've got a great fix" when in fact it's total garbage. It varies widely by phone model and location. We have a little library of techniques to correct it - send us a transid in PM and we can show you "before and after" for the cleanup, but in the end, the server is opportunistic about data - a bad "ray" is better than no data at all for the point. (You can always view the "after" in Google Earth by clicking on the transid in your uploads page, which is also useful for debugging location issues!)

Regarding the time-bound, the website and client both allow you to filter mapping data - the website has a first-and-last slider - the client just has a "since year" adjustment in Settings -> Map Settings.

Indoor mapping is a pretty thorny problem - specifically finding the 3d structural mapping data that's accurate and up-to-date. Any suggestions? ;)

Cheers, thanks for the detailed info, and happy stumbling!

-a
Hi Arkasha,
I do watch the indicators for walking speed and height. These are mostly reported reasonably. With poor GPS I occasionally see 2 mph when stationary; in the car they are fine, and outside scans in streets work well. I've never seen any changes that could be a ray forming, so maybe I should look at the data? I don't know how to access the data transmitted to you or what local processing is being done, which makes it difficult to suggest what you could reasonably do. It would be nice to have the code and flowcharts available.

As mentioned before, in malls height can be in error, can even be negative, and tends to drift up.

If there are brief glitches where the position suddenly changes, these could produce rays, but instantaneous position changes should be removable. There are issues with knowing the right context, though, i.e. the mode of transport. To be specific, I'd initially suggest having software look for rays and pick them out; they are quite obvious on maps, and the nearby known networks can be used for a rough correct position when the GPS track seems impossible due to high accelerations. Really this needs inspection of the data to find identifying features. If I get time I'll try to, but the position was https://goo.gl/maps/nELkfPpMvHv in case you have a way to find relevant files. No one else scanned the area.
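
A sketch of what "removing instantaneous position changes" could look like, assuming a walking context: flag any fix whose implied speed from the last accepted fix exceeds a limit. The 3 m/s limit, the tuple layout and the function names are assumptions for illustration, and the check simply trusts the first fix, so it cannot tell whether a jump is the error or the correction.

```python
import math

EARTH_RADIUS_M = 6371000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points (degrees)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def flag_jumps(fixes, max_speed_mps=3.0):
    """Return indices of fixes that imply an impossible speed.

    fixes: list of (t_seconds, lat_deg, lon_deg), sorted by time.
    Each fix is compared with the last fix that was NOT flagged.
    """
    flagged = []
    last = fixes[0]
    for i in range(1, len(fixes)):
        t, lat, lon = fixes[i]
        dt = t - last[0]
        dist = haversine_m(last[1], last[2], lat, lon)
        if dt <= 0 or dist / dt > max_speed_mps:
            flagged.append(i)
        else:
            last = fixes[i]
    return flagged


# One-second fixes on foot; the third one jumps about 1.1 km north.
track = [(0, 51.0, 0.0), (1, 51.00001, 0.0), (2, 51.01, 0.0), (3, 51.00002, 0.0)]
suspect = flag_jumps(track)
```

Flagging rather than deleting keeps the raw track intact, so a flagged fix can still be re-examined later.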

On the more general question of doing malls in 3D, it is difficult as there are walls and floors as well as the body of the person holding the phone. There is also a lot of data though, and this can be a good thing. I've started trying to do a run on the outside to give approximate positions for the outer wifi stations. Once these are known they can be used inside for reference. GPS often comes in through windows along the top of chambers and doesn't seem too bad when present, except for height, but it seems to produce these rays. The strongest signals, when 15 or more wifis are visible, are less likely to have walls in the way. This would mean looking for outside readings first over many runs and files.
Height is a big problem too. I think that at this stage a floor level should be available for selection; with that *known reference* a lot more sense can be made of the data. Also, definite selectors should be settable while scanning: you could have walking, bike, car, plane options.
The app is good in most situations. Review of data before transmission could be an option, with possible edits to the real-time path if it has glitches, but I suspect many of these problems would vanish if the general situation was known and sensible checks were made to see if the user really has moved ten miles, by seeing what wifi is also visible nearby.
As I say, zoom out on any big city and you see many of these rays.
But outside it generally works fine.
David
Hi Dave,

The client is open-sourced: https://github.com/wiglenet/wigle-wifi-wardriving - you can read how we log and report there, and PRs are very welcome! That being said, we avoid doing much data clean-up client-side because it limits our ability to reconstruct/reprocess server-side later, as we develop new techniques and find new issues (you can't re-interpret what you don't keep)

The rays are uniquely hard to spot when they happen - they're usually brief as the fix is lost internally and then re-established, but when you're in motion, the access point may not be re-observed, or may be re-observed with a weaker signal. Most of the problem is that the GPS core on the phone indicates a high degree of confidence that the GPS lock is valid at the time, even though it's clear to visual inspection that it's not. The most common sign that this has happened is a bad timestamp - since GPS relies on clock synchronization at its core, we try and spot bad clocks and classify accordingly on the server-side. The problem with simply doing time/location displacement checks is distinguishing between the valid and invalid behaviors: is the sudden jump the error or the correction? The numeric technique most commonly used to prevent sudden fluctuations in a time series like these is Kalman filtering - we've avoided incorporating it into the client to this point for the reasons stated above regarding discarding/altering source data.
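
For readers who haven't met it, a minimal sketch of the idea behind Kalman filtering, reduced to a single coordinate with a random-walk state model. This is not code from the client or the server; the function name and the two variance parameters are illustrative, and a real tracker would use a 2D state with velocity.

```python
def kalman_1d(measurements, meas_var, process_var):
    """Scalar Kalman filter with a random-walk state model.

    measurements: list of noisy positions along one axis (metres).
    meas_var:     assumed variance of each measurement (m^2).
    process_var:  assumed variance added per step by real movement (m^2).
    Returns a new list of filtered estimates; the input is not modified.
    """
    estimate = measurements[0]
    variance = meas_var
    filtered = [estimate]
    for z in measurements[1:]:
        variance += process_var                    # predict: uncertainty grows
        gain = variance / (variance + meas_var)    # how much to trust z
        estimate += gain * (z - estimate)          # update toward measurement
        variance *= (1 - gain)                     # uncertainty shrinks
        filtered.append(estimate)
    return filtered


# A near-stationary track with one 500 m glitch in the fourth sample.
raw = [0.0, 0.5, -0.3, 500.0, 0.2, 0.1]
smooth = kalman_1d(raw, meas_var=25.0, process_var=1.0)
```

The glitch is damped rather than removed, and the filtered values no longer match what the device reported, which is exactly the trade-off described above: once only the smoothed series is kept, the original observations can't be re-interpreted.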

Elevation in consumer-grade GPS hardware is particularly unreliable - it's usually a significant multiple of the horizontal dilution of precision - it can vary between +/- 23' - 400' even with a strong (read: close enough for practical location purposes) horizontal fix. There are a lot of reasons for this inherent in the shape of the earth and GPS constellation, as well as different modeling techniques. Consumer-grade GPS vendors will all caution you against relying on it for navigation.

Rather than specifying modes of locomotion (since people are always finding new ways to send Android devices to new places), we allow speed-based scanning settings - how frequently we update GPS and scan signals can vary with how fast you're moving. Because radio signals aren't necessarily planar, radio and antenna configurations vary widely by device, and modeling RF interactions with structures and topology is massively complicated, we haven't found it practical to dig too far into the specific properties of individual devices. This would be a logical but effort-intensive next step.
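
As a purely illustrative sketch of speed-based scanning (not the client's actual logic; the thresholds, interval values and function name are all made up for the example), the scan interval can be chosen from the current speed instead of from a declared mode of transport:

```python
def scan_interval_s(speed_mps, still_max=0.5, slow_max=8.0,
                    still_s=10.0, slow_s=4.0, fast_s=1.0):
    """Pick a scan interval (seconds) from the current speed (m/s).

    Slower movement means the surroundings change slowly, so scans can be
    spaced out; faster movement gets more frequent scans.
    """
    if speed_mps < 0:
        raise ValueError("speed must be non-negative")
    if speed_mps <= still_max:
        return still_s
    if speed_mps <= slow_max:
        return slow_s
    return fast_s


# Standing still, cycling pace, and driving pace.
intervals = [scan_interval_s(v) for v in (0.0, 5.0, 20.0)]
```

Keying on measured speed means a new way of carrying the device needs no new menu entry.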

While the rays are clearly inaccuracies, the decision not to simply ignore or remove them is intentional - they're still signal, even if not particularly high-quality signal. They appear and disappear over time, but as the number of observations for any individual point increases, they descend rapidly to zero for that point. So long as individual devices have quirky GPS implementations, rays will always be possible, but they'll self-correct over time. If you're using the data for other purposes, you can easily ignore low-QoS access points and save yourself any trouble they might introduce.
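
A toy illustration of that self-correction, assuming the access point position is estimated as a signal-weighted centroid of its observations. That estimator is an assumption made for the example (as are the function names), not a statement of how the server actually computes positions.

```python
def weighted_centroid(observations):
    """Signal-weighted centroid.

    observations: list of (x_m, y_m, rssi_dbm) in a local flat frame.
    Each observation is weighted by its linear power, 10 ** (rssi / 10).
    """
    weights = [10 ** (rssi / 10.0) for _, _, rssi in observations]
    total = sum(weights)
    x = sum(w * o[0] for w, o in zip(weights, observations)) / total
    y = sum(w * o[1] for w, o in zip(weights, observations)) / total
    return x, y


def ray_error_m(n_good):
    """Position error caused by one bad fix 1 km away among n_good correct ones."""
    bad = (1000.0, 0.0, -70.0)
    good = [(0.0, 0.0, -70.0)] * n_good
    x, _ = weighted_centroid(good + [bad])
    return x


# Error with 1, 9 and 99 good observations alongside the single bad one.
errors = [ray_error_m(n) for n in (1, 9, 99)]
```

With equal signal strengths the single bad fix pulls the estimate by 500 m, 100 m and 10 m respectively, so its influence shrinks in proportion to the number of good observations.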

Cheers,

-Ark
