A Briefing Tool that Learns Individual Report-writing Behavior



Mohit Kumar
Carnegie Mellon University
Pittsburgh, USA
mohitkum@cs.cmu.edu

Nikesh Garera
Johns Hopkins University
Baltimore, USA
ngarera@cs.jhu.edu

Alexander I. Rudnicky∗
Carnegie Mellon University
Pittsburgh, USA
air@cs.cmu.edu

Abstract

We describe a briefing system that learns to predict the contents of reports generated by users who create periodic (weekly) reports as part of their normal activity. We address the question of whether data derived from the implicit supervision provided by end-users is robust enough to support not only model parameter tuning but also a form of feature discovery. The system was evaluated under realistic conditions, by collecting data in a project-based university course where student group leaders were tasked with preparing weekly reports for the benefit of the instructors, using the material from individual student reports.

1 Introduction

In this paper we describe a personalized learning-based approach to summarization that minimizes the need for learning-expert time and eliminates the need for expert-generated evaluation materials such as a "gold standard" summary, since each user provides their own standard. Of course this comes at a cost, which is the end-user time needed to teach the system how to produce satisfactory summaries. We would however argue that end-user involvement is more likely to generate quality products that reflect both functional needs and user preferences and is indeed worth the effort.

The current paper describes the application of this approach in the context of an engineering project class in which students were expected to produce weekly summaries of their work. While each student produces their own logs, their team leader was additionally tasked with

query-relevant text summary system based on interactive learning. Learning is in the form of query expansion and sentence scoring by classification. [6] have explored interactive multi-document summarization, where the interaction with the user consisted of giving the user control over summary parameters, support for rapid browsing of the document set, and alternative forms of organizing and displaying summaries. Their approach of 'content selection' to identify key concepts in unigrams, bigrams and trigrams based on the likelihood ratio [4] is different from our statistical analysis and is of some interest. [13] have proposed a personalized summarization system based on the user's annotation. They have presented a good case for the usefulness of user annotations in obtaining personalized summaries. However, their system differs from the current one in several respects. Their scenario is a single-document newswire summary and is very different from a briefing. Also, their system is purely statistical and does not include the concept of a human-in-the-loop that improves performance.

[5] describe a summarization system for a recurring weekly report-writing task taking place in a research project. They found that knowledge-engineered features lead to the best performance, although this performance is close to that based on n-gram features. Given this, it would be desirable to have a procedure that leverages human knowledge to identify high-performance features but does not require the participation of experts in the process. We describe an approach to this problem below.

3 Target Domain

We identified a domain that, while similar in its reporting structure to the one studied by [5], differed in some significant respects. Specifically, the report-writers were not experienced researchers who were asked to generate weekly reports but were students taking a course that already had a requirement for weekly report generation. The course was project-based and taught in a university engineering school. The students, who were divided into groups working on different projects, were required to produce a weekly report of their activities, to be submitted to the instructors. Each group had well-defined roles, including that of a leader. Students in the class logged the time spent on the different activities related to the course. Each time-log entry included the following fields: date, category of activity, time spent and details of the activity. The category was selected from a predefined set that included Coding, Group Meeting, Research and others (all previously set by the instructors). The task of the team leader was to prepare a weekly report for the benefit of the instructor, using the time-log entries of the individual team members as raw material. As the students were already using an on-line system to create their logs, it was relatively straightforward to augment this application to allow the creation of leader summaries. The augmented application provided an interface that allowed the leader to more easily prepare a report and was also instrumented to collect data about their behavior. Instrumentation included mouse and keyboard level events (we do not report any analysis of these data in this paper).

Data Collection Process: Following [5], the leader selected items from a display of all items from the student reports. The leader was instructed to go through the items and select a subset for inclusion in the report. Selection was done by highlighting the "important" words/phrases in the items (described to the participant as being those words or phrases that led them to select that particular item for the report). The items with highlighted text automatically became the candidate briefing items. The highlighted words and phrases were subsequently designated as custom user features and were used to train a model of the user's selection behavior.

Data Collected: We were able to collect a total of 61 complete group-weeks of data. One group-week includes the time logs written by the members of a particular group and the associated extractive summaries. The class consisted of two stages, design and implementation, lasting about 6 and 9 weeks respectively. To provide consistent data for development, testing and analysis of our system, we selected 3 groups from the later stage of the class that produced reports most consistently (these are described further in the Evaluation section below).

4 Learning System

We modeled our learning process on the one described by [5]; that is, models were rebuilt on a weekly basis, using all training data available to that point (i.e., from the previous weeks). This model was then used to predict the user's selections in the current week. For example, a model built on weeks 1 and 2 was tested on week 3. Then a model built on weeks 1, 2 and 3 was tested on week 4, and so on.
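The weekly rebuild schedule described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `rolling_weekly_splits` and the `min_train_weeks` parameter are hypothetical, and the default of two training weeks is an assumption taken from the example in the text (weeks 1 and 2 tested on week 3).

```python
from typing import Iterator, List, Tuple


def rolling_weekly_splits(
    weeks: List[int], min_train_weeks: int = 2
) -> Iterator[Tuple[List[int], int]]:
    """Yield (training_weeks, test_week) pairs in chronological order.

    Each model is trained on every week seen so far and tested on the
    next week. `min_train_weeks` is an assumed starting point.
    """
    ordered = sorted(weeks)
    for i in range(min_train_weeks, len(ordered)):
        yield ordered[:i], ordered[i]


# Example: five weeks of data give three train/test rounds.
splits = list(rolling_weekly_splits([1, 2, 3, 4, 5]))
for train_weeks, test_week in splits:
    print(f"train on weeks {train_weeks}, test on week {test_week}")
```

Because the training set only grows, later rounds always have at least as much data as earlier ones.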

Because the vocabulary varied significantly from week to week, we trained models using only those words (features) that were to be found in the raw data for the target week, since it would not be meaningful to train models on non-observed features.
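One way to read this vocabulary restriction is as a filter applied to the training items before each weekly model is built. The sketch below assumes items are already tokenized into word lists; the function name `restrict_to_target_vocabulary` is hypothetical and the tokenization itself is outside the sketch.

```python
from typing import List, Set


def restrict_to_target_vocabulary(
    training_items: List[List[str]], target_items: List[List[str]]
) -> List[List[str]]:
    """Keep only training tokens that also occur in the target week."""
    target_vocab: Set[str] = {tok for item in target_items for tok in item}
    return [[tok for tok in item if tok in target_vocab] for item in training_items]


# Example with made-up log entries (illustrative data only).
train = [["fixed", "parser", "bug"], ["group", "meeting"]]
target = [["parser", "tests"], ["meeting", "notes"]]
filtered = restrict_to_target_vocabulary(train, target)
print(filtered)
```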

The resulting model is used to classify each candidate item as belonging in the summary or not. The confidence assigned to this classification was used to rank order the raw items and the top 5 items were designated as the (predicted) summary for that week. The following sections explain the learning system in more detail.
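The ranking step can be illustrated with a short sketch. It assumes the classifier has already produced one confidence value per candidate item; the function name `predict_summary` and the item identifiers are hypothetical.

```python
from typing import List, Tuple


def predict_summary(scored_items: List[Tuple[str, float]], k: int = 5) -> List[str]:
    """Return the ids of the k items with the highest classifier confidence.

    `scored_items` holds (item_id, confidence) pairs. Ties keep their
    input order because Python's sort is stable.
    """
    ranked = sorted(scored_items, key=lambda pair: pair[1], reverse=True)
    return [item_id for item_id, _ in ranked[:k]]


# Example with illustrative confidences for seven candidate items.
candidates = [
    ("i1", 0.10), ("i2", 0.90), ("i3", 0.40), ("i4", 0.75),
    ("i5", 0.20), ("i6", 0.85), ("i7", 0.60),
]
summary = predict_summary(candidates)
print(summary)
```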

[Table 1: data statistics for Groups 1 to 3, with rows labeled TNI, NIS, ANW and ANS; the cell values are not recoverable from this copy.]

weighting method - TF (term frequency), TF.IDF, Salton-Buckley (SB) [12]. b) Corpus for measuring IDF: For any word, the inverse document frequency can be obtained by considering either the documents in the training set or the test set or both. Therefore we have three different ways of calculating IDF. c) Normalization scheme for the various scoring functions: no normalization, L1 and L2.
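The scoring options listed above can be sketched in a few lines. The paper does not give the exact formulas, so the IDF variant used here, log(N / df), is an assumption, and the Salton-Buckley weighting is left out rather than guessed. All function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def inverse_document_frequency(corpus: List[List[str]]) -> Dict[str, float]:
    """IDF over a chosen corpus, using the assumed form log(N / df)."""
    n_docs = len(corpus)
    doc_freq = Counter(tok for doc in corpus for tok in set(doc))
    return {tok: math.log(n_docs / df) for tok, df in doc_freq.items()}


def tf_idf(tokens: List[str], idf: Dict[str, float]) -> Dict[str, float]:
    """Raw term frequency times IDF; tokens unseen in the corpus score 0."""
    return {tok: tf * idf.get(tok, 0.0) for tok, tf in Counter(tokens).items()}


def normalize(scores: Dict[str, float], scheme: str = "none") -> Dict[str, float]:
    """Apply no normalization, L1 normalization or L2 normalization."""
    if scheme == "none":
        return dict(scores)
    if scheme == "l1":
        norm = sum(abs(v) for v in scores.values())
    elif scheme == "l2":
        norm = math.sqrt(sum(v * v for v in scores.values()))
    else:
        raise ValueError(f"unknown normalization scheme: {scheme}")
    if norm == 0:
        return dict(scores)
    return {tok: v / norm for tok, v in scores.items()}


# The three IDF corpora named in the text: training set, test set, or both.
train_docs = [["code", "parser"], ["code", "meeting"]]
test_docs = [["parser", "demo"]]
idf_train = inverse_document_frequency(train_docs)
idf_test = inverse_document_frequency(test_docs)
idf_both = inverse_document_frequency(train_docs + test_docs)

scores = tf_idf(["parser", "parser", "code"], idf_train)
print(normalize(scores, "l2"))
```

Passing a different corpus to `inverse_document_frequency` is the only change needed to switch between the three IDF settings.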

Feature scoring in the first setting of extracting unigram features $F_{Raw}$ is straightforward using the above-mentioned IR parameters (TF, TF.IDF or SB). For combining the scores under the second setting with the 'user-specific' features we used the following equation:

$$S_f = (1 + \alpha) \cdot S_{f_{base}} \tag{1}$$

where $\alpha$ is the weight contribution for the user-specific features and $S_{f_{base}}$ is the base score (TF or TF.IDF or SB). We empirically fixed $\alpha$ to 1 for the current study.
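Equation (1) can be illustrated with a small sketch. It assumes that the boost is applied only to features the user highlighted and that all other features keep their base score; that reading, and the function name `combined_score`, are assumptions rather than details stated in the text.

```python
def combined_score(base_score: float, is_user_feature: bool, alpha: float = 1.0) -> float:
    """Apply S_f = (1 + alpha) * S_f_base to user-specific features.

    Assumption: features the user did not highlight keep their base score.
    alpha defaults to 1, the value fixed empirically in the study.
    """
    if is_user_feature:
        return (1.0 + alpha) * base_score
    return base_score


# With alpha = 1 a highlighted feature counts twice as much as its base score.
print(combined_score(0.5, is_user_feature=True))   # 1.0
print(combined_score(0.5, is_user_feature=False))  # 0.5
```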

We tested the above-mentioned variations of feature description, feature extraction and feature scoring using four learning schemes: Naive Bayes, Voted Perceptron, Support Vector and Logistic Regression. In the event, preliminary testing indicated that Support Vector and Logistic Regression were not suited for the problem at hand and so these were eliminated from further consideration. We used the Weka [11] package for developing the system.

4.2 Evaluation

The base performance metric is Recall, defined in terms of the items recommended by the system compared to the items ultimately selected by the user. We justify this by noting that Recall can be directly linked to the expected time savings for the eventual users of a prospective summarization system based on the ideas developed in this study. The objective functions that we used for selecting the system model (built on the basis of Recall) are:

1. Weighted mean recall (WMR) of the system across all weeks. The weeks are given linearly increasing weights (normalized), which captures the intuition that the performance in the later weeks is increasingly more important as they have consecutively more training data.

2. Slope of the phase-wise performance curve (Slope): We first calculate the three phase-wise recall performance values (normal average of the recall values) and then compute the slope of the curve for these three points.

Note that these metrics are used as a selection criterion only. Results in Figure 1 are stated in terms of the original Recall values averaged over the phase and across the three users. We compare these with the results for the random baseline. The random baseline is calculated by randomly selecting items over a large number (10000) of runs of the system and determining the mean performance value.

Figure 1. Figure showing the various experiments. Each panel plots Recall against Phase (1 to 3). (a) Recall values for the final model for individual users comparing the Raw N-gram features with User-Combined features. (b) Recall comparison between Information Gain selected features and User selected features. (c) Recall comparison between individual user training and cross-group training. Legend entries across the panels: User-Combined, Raw N-grams, Baseline, IG-Combined, Cross-Group.

2 The idea being to capture the user's preference with respect to particular classes of NEs, i.e., the user prefers to select an item where a person and organization are mentioned together.

3 We also experimented with using just the user-specific features in isolation but found these less useful than a combination of all features.
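The evaluation quantities of Section 4.2 (Recall, WMR, Slope and the random baseline) can be sketched as below. Several details are assumptions: weights of 1, 2, ..., n for WMR, a least-squares fit over phase positions 1, 2, 3 for the slope, and a random baseline that draws the same number of items (5) as the system recommends. All function names are hypothetical.

```python
import random
from typing import Sequence, Set


def recall(recommended: Set[str], selected: Set[str]) -> float:
    """Fraction of the user's selected items that the system recommended."""
    if not selected:
        return 0.0
    return len(recommended & selected) / len(selected)


def weighted_mean_recall(weekly_recalls: Sequence[float]) -> float:
    """Mean recall with linearly increasing, normalized weekly weights."""
    weights = range(1, len(weekly_recalls) + 1)
    return sum(w * r for w, r in zip(weights, weekly_recalls)) / sum(weights)


def phase_slope(phase_recalls: Sequence[float]) -> float:
    """Least-squares slope of recall against phase index (assumed method)."""
    n = len(phase_recalls)
    if n < 2:
        raise ValueError("need at least two phases to compute a slope")
    xs = range(1, n + 1)
    mean_x = sum(xs) / n
    mean_y = sum(phase_recalls) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, phase_recalls))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def random_baseline(
    all_items: Sequence[str], selected: Set[str],
    k: int = 5, runs: int = 10000, seed: int = 0,
) -> float:
    """Mean recall of k randomly chosen items over many runs."""
    rng = random.Random(seed)
    k = min(k, len(all_items))
    total = 0.0
    for _ in range(runs):
        total += recall(set(rng.sample(all_items, k)), selected)
    return total / runs


# Illustrative values only; these are not results from the paper.
print(weighted_mean_recall([0.2, 0.4, 0.6]))
print(phase_slope([0.3, 0.4, 0.5]))
items = [f"item{i}" for i in range(10)]
print(random_baseline(items, {"item0", "item1"}))
```

For a week with 10 candidate items of which the user selected 2, the random baseline should settle near 0.5, since 5 of the 10 items are drawn each run.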

5 Experiment and Results

We selected for experimentation the three groups that most consistently generated complete weekly data sets. These groups had 9, 8 and 8 complete weeks of data. Detailed statistics of the data are shown in Table 1. Since the

[Table: comparison of selection mechanisms by phase, with columns Information Gain (IG), Single User (SU) and Overlap IG/SU; the cell values are not recoverable from this copy.]
