Frank's Blog - Yes, Finally
Crowdsourcing - 9 March, 2016

Crowdsourcing is the new buzzword of the Interwebby.  If you have never heard of it, it is the idea that, if enough people get involved in the
discussion, you will eventually get enough information to do something.  It is essentially the digital version of democracy.  

Like democracy, in theory, it is a good idea.  In practice, it doesn't always work.  Sometimes you get George Washington and Thomas Jefferson.  
Sometimes you get Millard Fillmore and Andrew Johnson.  The fundamental problem is the same.  You never know anything about the people
doing the voting.

In the cruising world, the biggest crowdsourcing application is something called Active Captain.  There are a few others set up to compete or
contrast with AC, but AC is the big player right now (I say right now because, in the digital world, last week is last year, and a decade ago was the
Stone Age.  Just ask the guys who created MS-DOS).  It is integrated into a significant number of chartplotter systems and allows the user to easily
read about hazards, marinas, anchorages and other items of interest associated with the marine world.  In addition, each reader with an account
on Active Captain can add their own comments.

In theory - and sometimes in practice - this is a very good system.  One of the best sources of information we have out here is called "local
knowledge."  It is the sort of information you can only get from people who live in an area.  They are the ones who have been through a certain
pass a hundred times last year and know that you have to "stay to the right side over near Green 37, because the bank is shoaling up over in that
area" or that "you really need to avoid the area by that big rock, 'cause there's a tree stump that will snag you if you are not careful."  

When we lived in the marina in Richmond, Virginia, we knew that there was a sand bar about halfway down the creek that we had to go over to get
out.  When we first went in there, we got caught on that sand bar and had to back off and hunt for deep water three times before we could get past
it.  When we told that story to the marina manager, he pointed out a big white pine tree a little further down the creek from the sand bar.  He told us
that, if we started ten feet off the end of the marina docks and aimed for that pine tree, we would go over the sand bar at the deepest part.  We did
and never had a problem with that sand bar again.  That's local knowledge.

However, there is a difference between talking to one guy who has gone over a particular point a hundred times and talking to one hundred people
who have gone over that same point once.  If I do something a hundred times in all different sorts of conditions, I learn what works and what doesn't.
If, on the other hand, I go through one time, I learn what worked that day.

More importantly, unless something goes wrong, I don't know what didn't work and I don't know why it didn't work.  When I teach in Rock Hall, I know
that, when going out to the area I like to sail in, there is a stretch of water where I really need to stay to the right side of the channel in order to keep
from going aground.  I know this because, out of the dozens of times I have gone out there, I have gotten caught three times.  Not only do I know where
the bottom shallows out, I know what the bottom feels like and how it contours.  I know this because I have had to use that information to get out of
there.  It took three tries to get that much information.  The first time I went through there, all I learned was that the water was deep enough to get
through that day.

The important phrase here is "that day."  When we get local knowledge, we are looking for information developed over time, by trial and error.  
Most crowdsourcing fails for three reasons - or maybe "fails" is the wrong word.  It suffers for three reasons.

The first problem is that of summarization.  When I ask someone who has been through a certain point for his local knowledge, he gives me a
summary of his one hundred trips over the area.  He doesn't tell me about each of his one hundred trips.  He takes the sum of his knowledge and
gives me the important overview, maybe with some details.  On the other hand, with crowdsourcing, there is very little summarization.  I have to
read through a hundred different comments in order to develop my own sense of where the problems are.  While some might say that that is a
small price to pay, when you are dealing with dozens of "problem sites" a day, reading a hundred or so comments on each one could take up the
entire day.  It becomes much easier to read two or three comments on each one and hope that you have gotten the gist of the problem.

The second problem is observation bias.  Everything I see and hear gets filtered through my own biases.  If I boated all of my life on the
Chesapeake Bay, then traveling through the ICW, where things tend to be close in and narrow, I might say that the ICW is "too confining."  If I
boated all of my life on the rivers and creeks of the ICW, I might find the Chesapeake Bay "too open."  If my boat is a twenty foot day sailor, the
ICW is an "enjoyable drift through majestic old forests."  If it is a fifty foot cruising sloop with a 63 foot mast and a six foot keel, the ICW is a
"nerve-wracking collection of twists and turns through forests that try to grab your boat and pull you in."  If I spend all my time in high dollar
marinas, my opinion of any given marina is going to be very different from that of someone who spends most of their time anchored and only
goes into a marina when necessary.

There is really no way to avoid observation bias.  We simply need to know it exists and, for the most part, we can work with it.  When I read a movie
review in the newspaper (yes, I still do), I consider other movies that the same reviewer has commented on that I have seen and how well her
comments correspond to my opinions.  In some cases, the reviewer and I might agree completely on a certain type of film.  In those cases, I know
that her observation bias and mine are similar.  On the other hand, she may hate certain types of films that I love.  That is just as much good
information, since it tells me that her observation bias is different from mine and if she tells me a film is terrible, I will probably like it.

The key to observation bias is that you have to know enough about the person giving you advice to know how well their bias lines up with yours.  
You can only get this information by having a significant number of their observations of the same thing you have observed.  In crowdsourcing, this
means that you have to find people that have made a lot of comments about different things and you have to have gone through the same areas
to determine if you agree with their observations.  Ten comments about how wonderful a marina is are useless if the people making the
observations are all different from you.

This leads us to the third problem.  The theory of crowdsourcing is that, while we know that some chaff will get thrown in with the wheat, there will
be so much wheat that the chaff won't matter.  We know that some of the people commenting are not "experts."  They may feel they are experts,
they may be able to convince everyone in their own circle of friends that they are experts, they may even be experts in their own little section of the
world.  But that doesn't make them experts.  Their comments may be useless, nonsensical and even dangerous (I have seen all three).

But, without the experience of dealing with them, I don't know.  So, if I have ten comments on an area and three of them say one thing and four of
them say something different, how do I know which ones are "wheat" and which are "chaff"?  If I am reading a series of reviews on a given marina,
how do I know which ones come from transients like me and which come from people who live at the marina and know all of the staff personally?  Who among the people
adding comments are truly knowledgeable, who talks a good game and who are the ones who can't write a simple sentence but actually know what
they are doing?  This is particularly a problem with people who post information that has been "auto-corrected."  What they meant to say and what
actually got posted can be hugely different.

Finally, in that same category of problems, but with a much darker twist, are the people who simply lie.  We all know they are out there and we all
know that the Interwebby seems to attract them in droves.  Which comments - good and bad - are the ones posted by people who think it would be
funny to send a big power cruiser into a dark swamp or who decide that Jim and Sally Jones are just a little too "high-and-mighty" running that
marina and need to be taken down a peg?  Maybe there aren't a lot of these people "out there" but, with literally hundreds of thousands of
comments in the database, to assume there are none would be insane.

Having said all of this, you may think that I am truly opposed to crowdsourcing and that I never use it.  On the contrary, I use it every day when we
are traveling.  Every night, I look over the route we are going to travel and Suzanne and I go through the Active Captain comments for the trouble
spots, trying to get a feel for what we need to be careful of.  We check the tide tables for the areas where people have identified shoaling and we
take a look at the marina reviews along the way to give us an idea of where we can bail out in an emergency.  We use every tool at our command
to try to make our trip as safe and enjoyable as we can.

Because, like democracy, crowdsourcing is the worst way to get information, except for not having any at all.