There I was, once again ensnared by yet another scam from an e-commerce platform that boasted a seemingly reliable average 5-star rating. It's an all-too-familiar scenario that has happened to me numerous times. In an act of revenge and a desperate need to find a solution, I began to scrape reviews of the scam products and of their genuine counterparts from reputable sources, the purchases I should've made instead.
This was the genesis of my growing skepticism towards the traditional 5-star rating system. To counter these deceptive practices, I began concocting a new recipe, one that not only takes into account the rating but also integrates other pivotal elements like 'Helpful Feedback' and 'Seller Replies.' Let me break it down:
Helpful Feedback: This counts the number of people who found a particular review helpful, regardless of whether the review is positive or negative.
Seller Reply: This Boolean value indicates whether a seller has responded to a review, again regardless of its sentiment, shedding light on their engagement with customers.
Rating: The traditional 1 to 5-star rating system we all know.
As I delved deeper into these thoughts, they followed me everywhere, even to the bathroom. It occurred to me that while scammers could marshal an army of fake high-rating reviews, honest reviewers still held the power to counter them. They could submit low-rating reviews, upvote similar experiences and provide what we now term as 'Helpful Feedback.'
"How can I amplify the voices of these unsung heroes, these honest reviewers, to create a significant impact?" I thought to myself. And, in this spirit of resilience, I did exactly that!
The idea was to fuse these three components together, forging a novel rating system that could potentially distinguish scam stores from honest ones. My initial attempts didn't strike the right balance. I grouped each of the three components into 'high' and 'low' classes and tried all eight resulting combinations, but without success. You can see all these attempts in the Appendix.
However, I remained undeterred. This system wasn't merely a mathematical model; it was a voice for the deceived, a beacon of honesty in a sea of deceitful stars. It was a testament to the strength of genuine feedback and responsive sellers, the true pillars of e-commerce.
For now, it's a work in progress. It's an idea born out of necessity and refined in the crucible of skepticism. This is not the end; it's just the beginning. Armed with my experiences and these three pillars of e-commerce (Rating, Helpful Feedback, and Seller Reply), I ventured into a new frontier, trying to transform this intricate problem into a manageable one. Here, I discovered an interesting intersection of necessity, resilience, and heuristic methods. This unique blend of elements ultimately led to a novel concept: a heuristic approach to our e-commerce rating problem.
A Heuristic Approach: The Helpful Rating System
Heuristic, stemming from the Greek word εὑρίσκω, meaning "I find, discover", is a concept in mathematical optimization and computer science. It involves solving problems faster when traditional methods fall short, either in speed or in finding an exact solution. This typically involves trading accuracy or precision for speed. Essentially, a heuristic is a shortcut, offering an approximate solution good enough for the problem at hand, and significantly quicker to obtain.
With this concept in mind, I developed the 'Helpful Rating System', prioritizing people's feedback over mere star ratings. To ensure the impact of helpful feedback on the new rating, I followed these rules:
For each person who finds a particular review helpful, it will count the same as the review itself. This means, regardless of whether the review is positive or negative, it will be multiplied by the number of people who find it helpful.
Not only the people who write the reviews will be counted, but also those who take their time to upvote other reviews that they find helpful.
In the following sections, we will explore the implementation of the Helpful Rating System using two datasets, one for an honest store and one for a scam store. We use the Polars library to perform our calculations and data manipulations.
The dataset we use, which is a product of my web scraping project, comprises different reviews for both honest and scam stores. Below are the first five samples from each dataset:
Honest store dataset:
Scam store dataset:
From these datasets, we only consider two components, rating and helpful_feedback, excluding is_seller_reply. We also create a new column rating * (1 + helpful_feedback) to combine these two factors:
Here we multiply the rating with 1 + helpful_feedback. The idea behind this is to give a higher weight to ratings that are considered more helpful by other users.
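To make the weighting concrete, here is a minimal sketch in plain Python rather than Polars, so the arithmetic stays visible. The sample reviews and the names `add_weighted_rating` and `weighted_rating` are hypothetical and are not taken from the scraped datasets or the original notebook.

```python
# Hypothetical sample reviews; these are NOT rows from the scraped datasets.
reviews = [
    {"rating": 5, "helpful_feedback": 0},
    {"rating": 1, "helpful_feedback": 3},
    {"rating": 4, "helpful_feedback": 1},
]


def add_weighted_rating(rows):
    """Return copies of the rows with rating * (1 + helpful_feedback) added."""
    return [
        {**row, "weighted_rating": row["rating"] * (1 + row["helpful_feedback"])}
        for row in rows
    ]


weighted = add_weighted_rating(reviews)
for row in weighted:
    print(row)
```

A review with no helpful votes keeps its own rating (5 stays 5), while the 1-star review with three helpful votes contributes 1 * (1 + 3) = 4, as if four people had left it.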
For the honest store, our new DataFrame looks like:
And for the scam store, it is:
Next, we calculate the average ratings for both types of stores:
The results for the average ratings are:
For the honest store:
4.903765690376569
For the scam store:
4.091059602649007
Now, we will introduce the Helpful Rating System. This system, visible only to users, adds an additional layer of complexity that makes it more challenging for scammers to manipulate. On these two datasets, the honest store scores markedly higher than the scam one.
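One way to read the rules above is as a weighted average, where each review counts once for its author plus once per helpful vote. The sketch below assumes that reading. The function name `helpful_rating` and the sample values are hypothetical, and plain Python is used instead of Polars to keep the calculation explicit.

```python
def helpful_rating(ratings, helpful_feedback):
    """Weighted mean of ratings, using 1 + helpful_feedback as each weight.

    Assumption: the score is sum(rating * weight) / sum(weight).
    """
    weights = [1 + h for h in helpful_feedback]
    if not weights:
        raise ValueError("at least one review is required")
    weighted_sum = sum(r * w for r, w in zip(ratings, weights))
    return weighted_sum / sum(weights)


# Hypothetical sample: one unhelpful 5-star review, one 1-star review
# with three helpful votes, and one 4-star review with one helpful vote.
ratings = [5, 1, 4]
helpful = [0, 3, 1]

plain_average = sum(ratings) / len(ratings)
weighted_average = helpful_rating(ratings, helpful)
print(plain_average, weighted_average)
```

In this toy sample the plain average is about 3.33, while the helpful rating drops to about 2.43 because the upvoted 1-star review carries four votes' worth of weight.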
The average helpful ratings are:
For the honest store:
4.7317073170731705
For the scam store:
2.9344413665743305
Taking inspiration from the F1-score, which considers two essential components (precision and recall), I formulated a Helpful Rating System score in a similar vein:
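The F1-score is the harmonic mean of precision and recall. The sketch below assumes the score here is likewise the harmonic mean of the traditional average rating and the average helpful rating. That is an assumption about the original notebook, although it does reproduce the scores reported in this post from the two averages given earlier. The function name `f1_like_score` is hypothetical.

```python
def f1_like_score(average_rating, helpful_rating):
    """Harmonic mean of the two averages, mirroring the F1-score formula."""
    total = average_rating + helpful_rating
    if total == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * average_rating * helpful_rating / total


# Averages reported earlier in this post.
honest = f1_like_score(4.903765690376569, 4.7317073170731705)
scam = f1_like_score(4.091059602649007, 2.9344413665743305)
print(honest, scam)
```

Like the F1-score, the harmonic mean is pulled towards the smaller of its two inputs, so a store cannot compensate for a poor helpful rating with an inflated plain average.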
The F1-score-like Helpful Rating System scores are:
For the honest store:
4.816200300790103
For the scam store:
3.4175426304044842
For a more in-depth look, you can find the full notebook here. It can be run either locally or on Google Colab.
Comparison
The traditional rating system and the proposed Helpful Rating System serve different objectives and have different strengths. We'll now juxtapose these two systems to glean insights into their performance, effectiveness, and resilience against manipulation by malicious actors.
| System | Honest Store Average | Scam Store Average |
| --- | --- | --- |
| Traditional Rating | 4.903765690376569 | 4.091059602649007 |
| Helpful Rating System | 4.7317073170731705 | 2.9344413665743305 |
While the numerical ratings provide a quick snapshot of the comparison, they only represent a part of the story. The subtleties and potential implications of the Helpful Rating System come to light when we explore it in the broader context:
Resilience against manipulation: The traditional rating system, in its simplicity, is easy for scammers to exploit. They can artificially inflate their ratings, making it hard for users to differentiate between honest stores and scam ones. The Helpful Rating System, in contrast, introduces an additional level of complexity by considering the 'helpfulness' of a review. This aspect makes it more challenging for scammers to manipulate the system.
Representation of customer satisfaction: In the traditional rating system, every rating carries equal weight. However, the Helpful Rating System gives higher weight to ratings deemed 'helpful' by other users. This approach implies a more democratic and customer-centric system, as it better represents the sentiments of the larger customer base.
Store credibility: While the traditional system only provides an average rating, the Helpful Rating System reflects the feedback of the community. Ratings that are more helpful represent the sentiment of more people. Therefore, the Helpful Rating System can offer a more accurate gauge of a store's credibility.
Scalability: While the Helpful Rating System appears to be more complex due to the introduction of the 'helpfulness' factor, it's equally scalable. Both systems can handle large datasets, making them suitable for large e-commerce platforms.
On this data, the Helpful Rating System looks like a significant improvement over the traditional system. It brings a more nuanced understanding of user reviews and ratings, makes manipulation harder for scammers, and gives customers a more accurate picture of a store's reputation. While the numerical ratings reflect this to some extent, the real value of the Helpful Rating System becomes apparent when considering these broader aspects.
However, just as every coin has two sides, our Helpful Rating System also presents certain trade-offs and potential pitfalls. Despite the numerous advantages it offers over the traditional rating system, there are also considerations and possible edge cases that we must acknowledge and address.
Trade-offs and Pitfalls
Like any system, the Helpful Rating System is not without its potential drawbacks and edge cases. While it presents an innovative approach to combating fraudulent ratings and giving a voice to the collective user sentiment, it's important to be aware of potential issues that could arise:
Once the Helpful Rating System becomes mainstream, scammers may devise new ways to manipulate it.
Competitors may pay large groups of people to leave negative reviews and mark them as helpful, making a genuine store appear fraudulent.
If no helpfulness rating is available for a review, it becomes challenging to determine whether a store is genuine or not.
Scammers cannot be identified immediately. We need honest, low rating reviews from victims and helpful upvotes from other users to identify them.
The 'Seller Reply' factor may not always indicate a store's authenticity, as prompt replies could be automated and lack genuine interaction.
The system may disproportionately favor larger stores that have more customer interactions and reviews, thus making it harder for smaller or new businesses to establish credibility.
A potential for 'Helpful Feedback' bias exists as individuals may have varying interpretations of what makes a review 'helpful.'
The system may create a negative feedback loop for genuine stores that have had a few bad reviews, as these could be upvoted and deemed helpful, thereby harming the store's overall rating unfairly.
Future Work
The development and implementation of the Helpful Rating System present numerous opportunities for future work. While it demonstrates promising results, it's important to note that the system remains a work in progress. This preliminary stage offers a springboard for more extensive studies and enhancements that can refine and optimise the system further. Here are several areas that future work might consider:
Extensive testing: The Helpful Rating System needs rigorous testing using larger datasets, which include both rating and helpful feedback information. Ideally, these datasets should also include labels identifying stores as either honest or scams, allowing for an accurate assessment of the system's effectiveness.
Real-world application: Future work could involve a pilot implementation of the Helpful Rating System within an actual online marketplace. This would provide valuable insights into how the system operates in a real-world environment, the response of users to the system, and its impact on their purchasing decisions.
User studies: Understanding the user perspective is critical. Future work might involve conducting surveys or interviews to gauge users' comprehension, acceptance, and trust in the Helpful Rating System.
System enhancements: As with any system, there's always room for improvement. Future work might explore potential enhancements to the Helpful Rating System. This might include incorporating additional factors into the rating, such as the reviewer's reputation, or the time since the review was posted.
Handling manipulation attempts: Given that scammers might devise ways to manipulate the new system once it becomes mainstream, future work could explore potential countermeasures. This could involve developing algorithms to detect abnormal patterns of 'helpful' votes.
Evaluation metrics: Future work could also involve defining new metrics to evaluate the Helpful Rating System. These could be aimed at measuring aspects such as the system's resistance to manipulation, the accuracy of its ratings, or its impact on users' purchasing behaviour.
I invite readers who are aware of relevant datasets or interested in collaborating on these future directions to reach out or share their thoughts in the comments section. Let's work together to make online shopping safer and more trustworthy!
Conclusion
The 'Helpful Rating System' shows promise as an instrument to distinguish honest vendors from scammers by providing a more holistic assessment of user feedback. As a devoted data analyst, I find immense value in sharing these findings and opening them up for further discussion. In that spirit, I warmly encourage you to share or refer to this research. Fresh perspectives can only enhance this system, and who knows, it might even assist a fellow analyst (like myself!) in gaining exposure and potentially landing a job.
Intrigued by the insights provided here? There's more to come! As I continue refining this system and diving into new projects, I invite you to join me on this journey. You can follow my progress and check out my portfolio repository here
DAIly by @ranggakd: a collection of Data Analysis and Artificial Intelligence notebooks I've worked on almost daily.
Appendix
In this section, we delve into some initial concepts and attempts at formulating the Helpful Rating System. Although they didn't quite hit the mark, they provided invaluable insights and learning opportunities. They represent paths once explored and could still hold promise for future iterations or alternative approaches to the problem. It's essential to consider these 'failed' attempts not as dead ends, but as stepping stones towards a more refined and robust solution. The road to innovation is often paved with trials and errors, each one leading us closer to the solution we seek.
Each review contains the following features:
| Feature | Range | Low | High |
| --- | --- | --- | --- |
| Rating | [1, 5] | [1, 2] | [3, 5] |
| Seller reply | True or False | False | True |
| People who find this helpful | [0, N] | [Min, Median-1] | [Median, Max] |
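The Low/High grouping above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `to_bits` and the sample reviews are hypothetical, and the median is taken over the helpful counts of the sample itself.

```python
from statistics import median


def to_bits(rating, seller_reply, helpful, helpful_median):
    """Map one review to (rating_bit, seller_bit, helpful_bit); 0 = Low, 1 = High."""
    rating_bit = 1 if rating >= 3 else 0  # Low: [1, 2], High: [3, 5]
    seller_bit = 1 if seller_reply else 0  # Low: False, High: True
    # Low: below the median, High: the median and above
    helpful_bit = 1 if helpful >= helpful_median else 0
    return rating_bit, seller_bit, helpful_bit


# Hypothetical reviews as (rating, seller_reply, helpful_feedback).
sample = [(5, False, 0), (1, True, 3), (4, True, 1), (2, False, 10), (5, False, 0)]

helpful_median = median(h for _, _, h in sample)
bits = [to_bits(r, s, h, helpful_median) for r, s, h in sample]
print(helpful_median, bits)
```

Here the helpful counts are 0, 3, 1, 10, and 0, so the median is 1 and only the two reviews with zero helpful votes fall into the Low class for helpfulness.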
Total number of cases from the Low (0) and High (1) classes: 2*2*2 = 8
| Case | Rating | Seller reply | People who find this helpful |
| --- | --- | --- | --- |
| Case 0 | Low (0) | Low (0) | Low (0) |
| Case 1 | Low (0) | Low (0) | High (1) |
| Case 2 | Low (0) | High (1) | Low (0) |
| Case 3 | Low (0) | High (1) | High (1) |
| Case 4 | High (1) | Low (0) | Low (0) |
| Case 5 | High (1) | Low (0) | High (1) |
| Case 6 | High (1) | High (1) | Low (0) |
| Case 7 | High (1) | High (1) | High (1) |
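The case numbers follow from reading the three Low (0) / High (1) bits as a binary number, with Rating as the most significant bit and helpfulness as the least significant. A short sketch of that encoding, where the function name `case_index` is hypothetical:

```python
from itertools import product


def case_index(rating_bit, seller_bit, helpful_bit):
    """Read (rating, seller reply, helpful) bits as a 3-bit binary number."""
    return rating_bit * 4 + seller_bit * 2 + helpful_bit


# Enumerate all 2 * 2 * 2 = 8 combinations of Low (0) and High (1).
cases = {case_index(*bits): bits for bits in product((0, 1), repeat=3)}

for index, bits in sorted(cases.items()):
    print(f"Case {index}: rating={bits[0]}, seller_reply={bits[1]}, helpful={bits[2]}")
```

For example, a high rating with no seller reply and high helpfulness gives bits (1, 0, 1), which is Case 5.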
Priority:
People who find this helpful > Seller reply > Rating
Possible patterns of a scammer abusing the rating system:
| Case | People who find this helpful | Seller reply | Rating | Interpretation |
| --- | --- | --- | --- | --- |
| 7 | High | High | High | Worst case for scam: High ratings (potentially fake), high seller replies (possibly trying to control narrative), and high helpfulness (other potential scammers supporting the review) |
| 5 | High | Low | High | High ratings (potentially fake), no seller replies (negligence), and high helpfulness (other potential scammers supporting the review) |
| 6 | Low | High | High | High ratings (potentially fake), high seller replies (trying to control narrative), but low helpfulness (customers aren't agreeing) |
| 4 | Low | Low | High | Best case for scam: High ratings (potentially fake), no seller replies (negligence), and low helpfulness (customers aren't agreeing) |
Possible patterns of customers trying to warn others about a scammer:
| Case | People who find this helpful | Seller reply | Rating | Interpretation |
| --- | --- | --- | --- | --- |
| 3 | High | High | Low | Best warning case: Low ratings (highlighting poor service), high seller replies (defensiveness), and high helpfulness (customers agree with the review) |
| 1 | High | Low | Low | Low ratings (highlighting poor service), no seller replies (negligence), and high helpfulness (customers agree with the review) |
| 2 | Low | High | Low | Low ratings (highlighting poor service), high seller replies (defensiveness), but low helpfulness (customers don't agree or haven't seen the review) |
| 0 | Low | Low | Low | Worst warning case: Low ratings (highlighting poor service), no seller replies (negligence), and low helpfulness (customers don't agree or haven't seen the review) |
Example data of case distribution from this author's data: