Esercizi aggregation su MongoDB - Zip

file zips.json

Json del documento

{
"_id" : "01001",
"city" : "AGAWAM",
"loc" : [ -72.622739, 42.070206 ],
"pop" : 15338,
"state" : "MA"
}

Query da fare:

  • Return the list of cities by state
  • Return the states with population above 10 Million
  • Return the average city population by state
  • Return the city with the highest number of ZIP codes
  • Return the states with the highest and the lowest number of ZIP codes
  • Return the list of zip codes referring to each city ordered by list size
  • Return the largest and smallest cities by state
  • Return the states with city population above the national average


Group Accumulator Operators

$sum (Returns a sum of numerical values. Ignores non-numeric values.)

$avg (Returns an average of numerical values. Ignores non-numeric values.)

$first (Returns a value from the first document for each group. Order is only defined if the documents are in a defined order. - Available in $group stage only.)

$last (Returns a value from the last document for each group. Order is only defined if the documents are in a defined order. - Available in $group stage only.)

$max (Returns the highest expression value for each group. - Available in both $group and $project stages)

$min (Returns the lowest expression value for each group. - Available in both $group and $project stages)

$push (Returns an array of expression values for each group. - Available in $group stage only)

$addToSet (Returns an array of unique expression values for each group. Order of the array elements is undefined. - Available in $group stage only.)

$stdDevPop (Returns the population standard deviation of the input values. - Available in both $group and $project stages)

$stdDevSamp (Returns the sample standard deviation of the input values. - Available in both $group and $project stages)

{$sort: { field1: sortOrder, field2 : sortOrder, ... } }

Limit { $limit: positiveInteger }


Varie
$gt (Maggiore - db.inventory.find( { qty: { $gt: 20 } } ) )

1) Return the list of cities by state

db.zip.aggregate([
{ $match: {} },
{ $group: { _id: "$state", state: { $addToSet: "$city" } } },
]).pretty()

2) Ritorna la media delle città ogni stato e quante città ci sono in quello stato

db.zip.aggregate([
{ $group: { _id: "$state", state: { $addToSet: "$city" }, ncity: {$sum: 1} , tot: {$sum: "$pop"}} },
{ $project: { _id: "$_id", ncity: "$ncity" , totPopByState : { $divide: [ "$tot","$ncity"] }}},
]).pretty()

3) Ritorna solo tutte le citta' dello stato di MA

db.zip.aggregate([
{ $match: {state: "MA" } },
{ $group: { _id: "$state", city: { $addToSet: "$city" } } },
{ $unwind: "$city" },
{ $project: { _id:0 }}
]).pretty()

3) Ritorna tutti gli stati che hanno popolazione > 10000000

db.zip.aggregate([
{ $group: { _id: "$state", popstate: { $sum: "$pop" } } },
{ $match: { popstate: {$gt:10000000} } }
]).pretty()

4) Return the city with the highest number of popolation

db.zip.aggregate([
{ $group: { _id: "$state", popCity: { $sum: "$pop" } } },
{ $sort : { popCity : -1} },
{ $limit: 1 }
])

5) Return Largest and Smallest Cities by State

db.zip.aggregate([
{ $group:
{
_id: { state: "$state", city: "$city" },
pop: { $sum: "$pop" }
}
},
{ $sort: { pop: 1 } },
{ $group:
{
_id : "$_id.state",
biggestCity: { $last: "$_id.city" },
biggestPop: { $last: "$pop" },
smallestCity: { $first: "$_id.city" },
smallestPop: { $first: "$pop" }
}
},
{ $project:
{ _id: 0,
state: "$_id",
biggestCity: { name: "$biggestCity", pop: "$biggestPop" },
smallestCity: { name: "$smallestCity", pop: "$smallestPop" }
}
}
])








Query sbagliate:

db.zip.aggregate([
{ $match: {state: "MA" } },
{ $group: { _id: { state: "$state", city: "$city" }, popCity: { $sum: "$pop" }}},
{ $group: { _id: "$_id.city", citypop: {$sum: "$popCity" }}},
{ $sort : { citypop : -1} },
{ $group: { _id: "NULL", citys: {$addToSet: { name: "$_id",pop:"$citypop"}}}},
{ $unwind: "$citys" },
]).pretty()