본문 바로가기

Back-end/MongoDB

MongoDB: 고급 쿼리(Advanced Queries - Aggregation Framework)

반응형

Aggreagation Pipeline

Aggregate() function is used for creating pipelines

  • db.COLLECTION_NAME.aggregate(pipline,options)

Documents enter a multi-stage pipline that transforms the documents into an aggregated result

→ document는 multi-stage pipline을 통과해서 aggregated된 result를 반환

SQL to Attregation Mapping Chart

Aggregation Pipeline Stages

  • $match
    • { $match: {query}}
    • aggregation pipeline은 가능하면 앞에 둔다.
      • aggregation pipeline안 document 수를 제어한다.
      • pipe에서 처리하는 양을 최소화한다.
  • $project
    • {$project: { specification }}
    • 특정 필드만 다음 stage로 통과시킨다.
  • $group
    • 값으로 document를 그룹짓는다.
    • { $group: { _id: expression, field1: { accumulator1 : expression1 }, ... } }
      • _id: 필수
    • Accumulators
      • $max, $min, $avg, $sum
      • $addToSet, $push
      • $first, $last
  • $unwind
    • {$unwind: field path}
    • array field를 분해해준다.
  • $sort
    • { $sort: { field1 : sort order, field2: sort order ... } }
      • 1: 오름차순
      • -1: 내림차순
    • 정렬해준다.
  • $limit
    • {$limit: positive integer}
    • 다음 스테이지를 넘어가는 document의 수를 제한한다.
  • $skip
    • {$skip: positive integer}
    • 다음 스테이지로 넘어가는 document를 앞에서부터 n개 skip한다

Practice

  1. State-wise total population
db.zips.aggregate({$group:
    {
        _id: "$state",
        population: {$sum: "$pop"}
    }
})
  1. Count of zipcodes for states
db.zips.aggregate({$group:
{
    _id: "$state",
    population: {$sum: "$pop"},
    zips: {$sum:1}
}
})
  1. Sort cities by number of zips
db.zips.aggregate({$group:
{
    _id: "$state",
    population: {$sum: "$pop"},
    zips: {$sum:1}
}},
{ $sort: { "zips": -1 }})
  1. Sort the states with more than 1000 zips
db.zips.aggregate({$group:
{
    _id: "$state",
    population: {$sum: "$pop"},
    zip: {$sum: 1}
}},
{$sort: {"zip": -1}},
{$match: {"zip": {$gt: 1000}}})

Task

  1. Show only state name and location of BELL GARDENS
db.zips.aggregate(
    {$match: {"city": "BELL GARDENS"}},
    {$project: {"_id":0, "state":1, "loc": 1}}
)
  1. Show five most populated cities of New York state
db.zips.aggregate(
    {$match: { state: "NY"}},
    {$sort: {pop: -1}},
    {$limit: 5}
)
  1. Show all states with total population of greater than 10 million and sort the result
db.zips.aggregate({$group:
{
    _id: "$state",
    population: {$sum: "$pop"}
}},
{$match : {population: {$gt: 10000000}}},
{$sort : {population: -1}})
  1. Rewrite the above query using $group
db.zips.aggregate( [
{$match:{state: "CA"}},
{$sort:{pop: -1}},
{$limit: 1}
])

db.zips.aggregate(
{$group:
{
    _id: "$state",
    maxpop: {$max: "$pop"}
}},
{ $project: {
    _id: 0,
    state: "$_id",
    maxpop: 1
}},
{ $match: { state: "CA"}})
반응형

'Back-end > MongoDB' 카테고리의 다른 글

MongoDB: 중급 쿼리 (Manipulating Data - Intermediate Query)  (0) 2022.04.29
MongoDB: 기본 쿼리(Basic Quries)  (0) 2022.04.29
MongoDB Overview  (0) 2022.04.26
Big Data Storage  (0) 2022.04.26
Introduction to Big Data  (0) 2022.04.26