
A Briefing Tool that Learns Individual Report-writing Behavior


Mohit Kumar

Carnegie Mellon University

Pittsburgh, USA

mohitkum@cs.cmu.edu

Nikesh Garera

Johns Hopkins University

Baltimore, USA

ngarera@cs.jhu.edu

Alexander I. Rudnicky∗

Carnegie Mellon University

Pittsburgh, USA

air@cs.cmu.edu

Abstract

We describe a briefing system that learns to predict the contents of reports generated by users who create periodic (weekly) reports as part of their normal activity. We address the question whether data derived from the implicit supervision provided by end-users is robust enough to support not only model parameter tuning but also a form of feature discovery. The system was evaluated under realistic conditions, by collecting data in a project-based university course where student group leaders were tasked with preparing weekly reports for the benefit of the instructors, using the material from individual student reports.

1 Introduction

In this paper we describe a personalized learning-based approach to summarization that minimizes the need for learning-expert time and eliminates the need for expert-generated evaluation materials such as a “gold standard” summary, since each user provides their own standard. Of course this comes at a cost, which is the end-user time needed to teach the system how to produce satisfactory summaries. We would however argue that end-user involvement is more likely to generate quality products that reflect both functional needs and user preferences and is indeed worth the effort.

The current paper describes the application of this approach in the context of an engineering project class in which students were expected to produce weekly summaries of their work. While each student produces their own logs, their team leader was additionally tasked with

query-relevant text summary system based on interactive learning. Learning is in the form of query expansion and sentence scoring by classification. [6] have explored interactive multi-document summarization, where the interaction with the user was in terms of giving the user control over summary parameters, support for rapid browsing of the document set, and alternative forms of organizing and displaying summaries. Their approach of ‘content selection’ to identify key concepts in unigrams, bigrams and trigrams based on the likelihood ratio [4] is different from our statistical analysis and is of some interest. [13] have proposed a personalized summarization system based on the user’s annotation. They have presented a good case for the usefulness of the user’s annotations in obtaining personalized summaries. However, their system differs from the current one in several respects. Their scenario is a single-document newswire summary and is very different from a briefing. Also, their system is purely statistical and does not include the concept of a human-in-the-loop that improves performance.

[5] describe a summarization system for a recurring weekly report-writing task taking place in a research project. They found that knowledge-engineered features lead to the best performance, although this performance is close to that based on n-gram features. Given this, it would be desirable to have a procedure that leverages human knowledge to identify high-performance features but does not require the participation of experts in the process. We describe an approach to this problem below.

3 Target Domain

We identified a domain that, while similar in its reporting structure to the one studied by [5], differed in some significant respects. Specifically, the report-writers were not experienced researchers who were asked to generate weekly reports but were students taking a course that already had a requirement for weekly report generation. The course was project-based and taught in a university engineering school. The students, who were divided into groups working on different projects, were required to produce a weekly report of their activities, to be submitted to the instructors. Each group had well-defined roles, including that of a leader. Students in the class logged the time spent on the different activities related to the course. Each time-log entry included the following fields: date, category of activity, time spent and details of the activity. The category was selected from a predefined set that included Coding, Group Meeting, Research and others (all previously set by the instructors). The task of the team leader was to prepare a weekly report for the benefit of the instructor, using the time-log entries of the individual team members as raw material. As the students were already using an on-line system to create their logs, it was relatively straightforward to augment this application to allow the creation of leader summaries. The augmented application provided an interface that allowed the leader to more easily prepare a report and was also instrumented to collect data about their behavior. Instrumentation included mouse and keyboard level events (we do not report any analysis of these data in this paper).

Data Collection Process: Following [5], the leader selected items from a display of all items from the student reports. The leader was instructed to go through the items and select a subset for inclusion in the report. Selection was done by highlighting the “important” words/phrases in the items (described to the participant as being those words or phrases that led them to select that particular item for the report). The items with highlighted text automatically become the candidate briefing items.1 The highlighted words and phrases subsequently were designated as custom user features and were used to train a model of the user’s selection behavior.

Data Collected: We were able to collect a total of 61 complete group-weeks of data. One group-week includes the time logs written by the members of a particular group and the associated extractive summaries. The class consisted of two stages, design and implementation, lasting about 6 and 9 weeks respectively. To provide consistent data for development, testing and analysis of our system, we selected 3 groups from the later stage of the class that produced reports most consistently (these are described further in the Evaluation section below).

4 Learning System

We modeled our learning process on the one described by [5]; that is, models were rebuilt on a weekly basis, using all training data available to that point (i.e., from the previous weeks). This model was then used to predict the user’s selections in the current week. For example, a model built on weeks 1 and 2 was tested on week 3. Then a model built on weeks 1, 2 and 3 was tested on week 4, and so on.
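The weekly rebuild schedule described above can be sketched as an expanding-window split. This is only an illustration: the function name `expanding_window_splits` and the `min_train` parameter are hypothetical, and starting with two training weeks is an assumption taken from the example in the text.

```python
from typing import Iterator, List, Tuple


def expanding_window_splits(
    weeks: List[int], min_train: int = 2
) -> Iterator[Tuple[List[int], int]]:
    """Yield (training_weeks, test_week) pairs.

    Each model is trained on every week before the test week,
    so the training set grows by one week per step.
    """
    ordered = sorted(weeks)
    for i in range(min_train, len(ordered)):
        yield ordered[:i], ordered[i]


splits = list(expanding_window_splits([1, 2, 3, 4, 5]))
for train_weeks, test_week in splits:
    print(f"train on weeks {train_weeks} -> test on week {test_week}")
```

With five weeks of data this produces three train/test pairs, the first being weeks 1 and 2 tested on week 3.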

Because the vocabulary varied significantly from week to week, we trained models using only those words (features) that were to be found in the raw data for the target week, since it would not be meaningful to train models on non-observed features.
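One way to picture this vocabulary restriction is the sketch below. It assumes simple lowercase whitespace tokenization, which the paper does not specify, and all function names and sample items are hypothetical.

```python
from typing import List, Set


def tokenize(text: str) -> List[str]:
    # Assumption: lowercase whitespace tokenization stands in for
    # whatever tokenizer the original system used.
    return text.lower().split()


def target_week_vocabulary(target_items: List[str]) -> Set[str]:
    """Collect every word observed in the target week's raw items."""
    vocab: Set[str] = set()
    for item in target_items:
        vocab.update(tokenize(item))
    return vocab


def restrict_to_vocabulary(items: List[str], vocab: Set[str]) -> List[List[str]]:
    """Drop training tokens that never occur in the target week."""
    return [[tok for tok in tokenize(item) if tok in vocab] for item in items]


training_items = ["Fixed parser bug in module", "Group meeting about design"]
target_items = ["Tested parser module", "Group meeting with client"]

vocab = target_week_vocabulary(target_items)
filtered = restrict_to_vocabulary(training_items, vocab)
print(filtered)
```

Only the words shared with the target week survive in the training items, so the model is never fit on features it cannot observe at prediction time.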

The resulting model is used to classify each candidate item as belonging in the summary or not. The confidence assigned to this classification was used to rank order the raw items and the top 5 items were designated as the (predicted) summary for that week. The following sections explain the learning system in more detail.
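The ranking step can be illustrated as follows. The confidence values and item identifiers are invented for the example, and `predict_summary` is a hypothetical name; only the rule of keeping the top 5 items by classifier confidence comes from the text.

```python
from typing import Dict, List


def predict_summary(confidences: Dict[str, float], k: int = 5) -> List[str]:
    """Rank items by classifier confidence and keep the top k.

    `confidences` maps an item id to the confidence that the item
    belongs in the summary.
    """
    ranked = sorted(confidences.items(), key=lambda pair: pair[1], reverse=True)
    return [item_id for item_id, _ in ranked[:k]]


# Hypothetical confidences for seven candidate items.
confidences = {
    "a": 0.9, "b": 0.2, "c": 0.75, "d": 0.4,
    "e": 0.6, "f": 0.1, "g": 0.8,
}
summary = predict_summary(confidences)
print(summary)
```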

Table 1. Weekly data statistics for Group 1, Group 2 and Group 3 (row labels: TNI, NIS, ANW, ANS). The cell values are not recoverable in tabular form.

weighting method - TF (term frequency), TF.IDF, Salton-Buckley (SB) [12].

b) Corpus for measuring IDF: For any word, the inverse document frequency can be obtained by considering either the documents in the training set or the test set or both. Therefore we have three different ways of calculating IDF.

c) Normalization scheme for the various scoring functions: no normalization, L1 and L2.
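The parameter choices listed above (term weighting, IDF corpus, normalization) can be sketched as below. This is an illustration under stated assumptions: the IDF variant log(N/df) is a common textbook form and is not confirmed by the paper, the Salton-Buckley weighting is omitted, and all names and toy documents are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def idf(corpus: List[List[str]]) -> Dict[str, float]:
    """Inverse document frequency, assuming the plain log(N / df) form."""
    n_docs = len(corpus)
    doc_freq = Counter(tok for doc in corpus for tok in set(doc))
    return {tok: math.log(n_docs / df) for tok, df in doc_freq.items()}


def normalize(vec: Dict[str, float], scheme: str) -> Dict[str, float]:
    """Apply no normalization, L1, or L2 to a feature-score vector."""
    if scheme == "none":
        return dict(vec)
    if scheme == "l1":
        norm = sum(abs(v) for v in vec.values())
    elif scheme == "l2":
        norm = math.sqrt(sum(v * v for v in vec.values()))
    else:
        raise ValueError(f"unknown normalization scheme: {scheme}")
    return {t: v / norm for t, v in vec.items()} if norm else dict(vec)


def score_item(tokens: List[str], idf_table: Dict[str, float],
               use_idf: bool = True, norm: str = "none") -> Dict[str, float]:
    """Score each token by TF or TF.IDF, then normalize."""
    tf = Counter(tokens)
    # Tokens missing from the IDF corpus get weight 0 under TF.IDF.
    vec = {t: c * (idf_table.get(t, 0.0) if use_idf else 1.0)
           for t, c in tf.items()}
    return normalize(vec, norm)


train = [["parser", "bug"], ["group", "meeting"]]
test = [["parser", "test"]]
# The three ways of choosing the corpus for IDF.
idf_corpora = {"train": train, "test": test, "both": train + test}

idf_both = idf(idf_corpora["both"])
tf_l1 = score_item(["parser", "parser", "bug"], {}, use_idf=False, norm="l1")
tfidf_l2 = score_item(["parser", "bug"], idf_both, norm="l2")
print(idf_both["parser"], tf_l1, tfidf_l2)
```

Switching the IDF corpus only changes which table is passed to `score_item`, which makes the three corpus options easy to compare side by side.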

Feature scoring in the first setting of extracting unigram features $F_{Raw}$ is straightforward using the above-mentioned IR parameters (TF, TF.IDF or SB). For combining the scores under the second setting with the ‘user-specific’ features we used the following equation:

$$S_f = (1 + \alpha) \cdot S_{f_{base}} \tag{1}$$

where $\alpha$ is the weight contribution for the user-specific features and $S_{f_{base}}$ is the base score (TF or TF.IDF or SB). We empirically fixed $\alpha$ to ‘1’ for the current study.
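A small numeric illustration of Equation (1) follows. It assumes that the boost is applied only to features the user highlighted and that all other features keep their base score; the text does not state this explicitly. The function name and the feature scores are hypothetical.

```python
from typing import Dict, Set


def combine_scores(base_scores: Dict[str, float], user_features: Set[str],
                   alpha: float = 1.0) -> Dict[str, float]:
    """Apply S_f = (1 + alpha) * S_f_base to user-specific features.

    Assumption: features outside `user_features` keep their base score.
    """
    return {
        feat: (1.0 + alpha) * score if feat in user_features else score
        for feat, score in base_scores.items()
    }


base_scores = {"parser": 0.5, "meeting": 0.25, "the": 0.1}
user_features = {"parser"}  # words the leader highlighted
combined = combine_scores(base_scores, user_features)
print(combined)
```

With the value of 1 used in the study, a highlighted feature simply counts twice as much as its base score.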

We tested the above-mentioned variations of feature description, feature extraction and feature scoring using four learning schemes: Naive Bayes, Voted Perceptron, Support Vector and Logistic Regression. In the event, preliminary testing indicated that Support Vector and Logistic Regression were not suited for the problem at hand and so these were eliminated from further consideration. We used the Weka [11] package for developing the system.

2 The idea is to capture the user’s preference with respect to particular classes of NEs, i.e., the user prefers to select an item where a person and organization are mentioned together.

3 We also experimented with using just the user-specific features in isolation but found these less useful than a combination of all features.

4.2 Evaluation

The base performance metric is Recall, defined in terms of the items recommended by the system compared to the items ultimately selected by the user.4 We justify this by noting that Recall can be directly linked to the expected time savings for the eventual users of a prospective summarization system based on the ideas developed in this study.

Figure 1. Figure showing the various experiments. Each panel plots Recall against Phase (1 to 3). (a) Recall values for the final model for individual users, comparing the Raw N-gram features with User-Combined features (series: User-Combined, Raw N-grams, Baseline). (b) Recall comparison between Information Gain selected features and user-selected features (series: IG-Combined, User-Combined, Raw N-grams, Baseline). (c) Recall comparison between individual user training and cross-group training (series: Cross-Group, IG-Combined, User-Combined, Raw N-grams, Baseline).

The objective functions that we used for selecting the system model (built on the basis of Recall) are:

1. Weighted mean recall (WMR) of the system across all weeks. The weeks are given linearly increasing weights (normalized), which captures the intuition that the performance in the later weeks is increasingly more important as they have consecutively more training data.

2. Slope of the phase-wise performance curve (Slope): We first calculate the three phase-wise recall performance values (normal average of the recall values) and then compute the slope of the curve for these three points.
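The two selection criteria can be sketched as follows, together with the Recall measure they build on. Several details are assumptions: week i receives weight i divided by the sum of all week indices, the slope is taken from a least-squares line through the phase means at x = 1, 2, 3, and the grouping of weeks into phases is supplied by the caller. The recall values are invented for the example.

```python
from typing import Iterable, List, Sequence


def recall(recommended: Iterable[str], selected: Iterable[str]) -> float:
    """Fraction of the user's selected items that the system recommended."""
    selected_set = set(selected)
    if not selected_set:
        return 0.0
    return len(selected_set & set(recommended)) / len(selected_set)


def weighted_mean_recall(weekly_recall: Sequence[float]) -> float:
    """Mean recall with normalized, linearly increasing weights per week."""
    n = len(weekly_recall)
    total = n * (n + 1) / 2
    return sum((i + 1) / total * r for i, r in enumerate(weekly_recall))


def phase_slope(phase_recalls: Sequence[Sequence[float]]) -> float:
    """Least-squares slope through the per-phase mean recall values."""
    means: List[float] = [sum(p) / len(p) for p in phase_recalls]
    xs = list(range(1, len(means) + 1))
    x_mean = sum(xs) / len(xs)
    y_mean = sum(means) / len(means)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, means))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den


r = recall(["a", "b", "c"], ["a", "c", "d", "e"])
wmr = weighted_mean_recall([0.2, 0.4, 0.6])
slope = phase_slope([[0.2, 0.4], [0.4, 0.6], [0.6, 0.8]])
print(r, wmr, slope)
```

A model whose recall improves steadily across phases gets a positive slope, while WMR rewards models that do well in the later, better-trained weeks.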

Note that these metrics are used as a selection criterion only. Results in Figure 1 are stated in terms of the original Recall values averaged over the phase and across the three users. We compare these with the results for the random baseline. The random baseline is calculated by randomly selecting items over a large number (10,000) of runs of the system and determining the mean performance value.5
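A random baseline of this kind can be approximated with a short simulation. The sketch assumes the baseline draws 5 items without replacement per run, mirroring the top-5 rule used for predicted summaries; the item counts, the seed, and the function name are hypothetical.

```python
import random
from typing import Set


def random_baseline_recall(n_items: int, selected: Set[int], k: int = 5,
                           runs: int = 10000, seed: int = 0) -> float:
    """Mean recall of k randomly chosen items over many runs."""
    rng = random.Random(seed)
    items = list(range(n_items))
    total = 0.0
    for _ in range(runs):
        picked = set(rng.sample(items, min(k, n_items)))
        total += len(picked & selected) / len(selected)
    return total / runs


# Hypothetical week: 20 candidate items, 4 of which the leader selected.
estimate = random_baseline_recall(n_items=20, selected={0, 1, 2, 3})
print(round(estimate, 3))
```

For this setup the expected recall of a random pick is k divided by the number of items, so the estimate should land close to 0.25.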

5 Experiment and Results

We selected for experimentation the three groups that most consistently generated complete weekly datasets. These groups had 9, 8 and 8 complete weeks of data. Detailed statistics of the data are shown in Table 1. Since the

Selection Mechanism: Information Gain (IG), Single User (SU), Overlap IG/SU

Phase 1: 1.1

12.1

5.50

5.1

Phase 3: 1.6, 17.6, 0.4
